Vish_Pat
7 Posts

Compute Grid PJM queries.

‏2012-03-05T07:51:00Z |
Hi,

We are using CG 6.1 in our project. While going through the documentation I came across a few questions, and it would be helpful if they could be clarified.

1) Transactions in PJM: Since the PJM spawns a top-level job that monitors all the subjobs, is the actual (JTA) transaction started at the start of the top-level job, or does each subjob run in a separate JTA transaction?

2) Since checkpointing is provided at the subjob level, if the number of parallel jobs to be spawned is, e.g., 5 and the checkpointing is record based at 2, what will be the outcome of this approach? Will the transaction be committed after every 2 records processed by each subjob?

3) Apart from PJM, I was not able to find the SQL scripts for LREE (I managed to find the LRSCHED scripts).

Thanks in advance,
Vishal
Updated on 2012-03-06T15:58:33Z by Vish_Pat
  • SystemAdmin
    783 Posts

    Re: Compute Grid PJM queries.

    ‏2012-03-05T13:09:58Z  
    Vishal,

    1) Transactions in PJM: Since the PJM spawns a top-level job that monitors all the subjobs, is the actual (JTA) transaction started at the start of the top-level job, or does each subjob run in a separate JTA transaction?

    Each subjob runs in its own JTA transaction. The top-level job can use the Synchronization interface for the begin/commit/rollback semantics of a logical transaction that demarcates the execution of all the subjobs: begin is called before the subjobs are submitted, and commit or rollback is called after all subjobs end. The Synchronization interface provides a control point for executing compensation logic to back out (or accept) the updates made by the subjobs. Note that this Synchronization is not a JTA transaction; no resources are actually enlisted.
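
    To make the begin/commit/rollback flow concrete, here is a minimal sketch in plain Java. It does not use the Compute Grid API: the LogicalTransaction interface, RecordingTransaction, and runSubjobs are hypothetical names made up for illustration, and the subjobs are simulated with a thread pool. It only shows the ordering: begin before the subjobs are submitted, then commit or rollback once all of them have ended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LogicalTransactionSketch {

    /** Hypothetical stand-in for the logical-transaction control point (not the Compute Grid interface). */
    interface LogicalTransaction {
        void begin();
        void commit();
        void rollback();
    }

    /** Records which callbacks fired, so the ordering can be inspected. */
    static class RecordingTransaction implements LogicalTransaction {
        final List<String> events = new ArrayList<>();
        public void begin()    { events.add("begin"); }
        public void commit()   { events.add("commit"); }    // accept the subjobs' updates
        public void rollback() { events.add("rollback"); }  // compensation logic would go here
    }

    /** Runs the simulated subjobs in parallel and returns true only if every one succeeded. */
    static boolean runSubjobs(List<Callable<Boolean>> subjobs, LogicalTransaction tx) {
        tx.begin(); // called before any subjob is submitted
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subjobs.size()));
        boolean allOk = true;
        try {
            // invokeAll blocks until every subjob has finished (normally or with an exception)
            for (Future<Boolean> result : pool.invokeAll(subjobs)) {
                try {
                    if (!Boolean.TRUE.equals(result.get())) {
                        allOk = false;
                    }
                } catch (ExecutionException e) {
                    allOk = false; // the subjob threw an exception
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            allOk = false;
        } finally {
            pool.shutdown();
        }
        // called only after all subjobs have ended
        if (allOk) {
            tx.commit();
        } else {
            tx.rollback();
        }
        return allOk;
    }
}
```

    If every simulated subjob returns true, the recorded events are begin followed by commit; if any subjob returns false or throws, the sequence ends in rollback instead.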

    2) Since checkpointing is provided at the subjob level, if the number of parallel jobs to be spawned is, e.g., 5 and the checkpointing is record based at 2, what will be the outcome of this approach? Will the transaction be committed after every 2 records processed by each subjob?

    If you have 5 subjobs that are configured to checkpoint every 2 seconds, then you will have 5 threads, each running JTA transactions and committing checkpoints every 2 seconds. That means 5 JTA begins and 5 JTA commits every 2 seconds, and so on, until all 5 threads exhaust their input data.

    3) Apart from PJM, I was not able to find the SQL scripts for LREE (I managed to find the LRSCHED scripts).

    You should find both the LRS and LREE DDLs in the longRunning folder. If you find only LRS, you may be looking at a WAS 8 install. In WAS 8, we combined the LRS and LREE DDL.

    -Chris
  • Vish_Pat
    7 Posts

    Re: Compute Grid PJM queries.

    ‏2012-03-05T17:12:51Z  
    Thank you, Chris.

    1) My point 2 was misread as time-based checkpointing; the question was about record-based checkpointing.
    2) I wanted a suggestion: is it OK for a parameterizer to connect to a DB to fetch the number of subjobs to spawn, and then pass a properties object to each subjob?

    Thanks,
    Vishal
  • SystemAdmin
    783 Posts

    Re: Compute Grid PJM queries.

    ‏2012-03-05T17:27:52Z  
    Vishal,

    Well, I don't know how I got seconds out of your question :) If your checkpoint is every 2 records, then each of your subjobs will read 2 records, process 2 records, and then commit. So yes, each thread would checkpoint (commit) after every 2 records. The key is that each subjob uses the same checkpoint policy by default. Of course, you could parameterize each subjob via substitution if you wanted a different interval (or policy) for different subjobs. In my experience, that is not typical. Additionally, I would suggest that a checkpoint interval of 2 records is very fine-grained.
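
    As a rough illustration of record-based checkpointing, the sketch below counts the commits a subjob would make for a given interval. It is plain Java, not Compute Grid code; commitsFor and totalCommits are hypothetical helpers, and the extra commit for a trailing partial chunk at the end of the step is an assumption of this sketch.

```java
public class RecordCheckpointSketch {

    /** Counts the commits one simulated subjob performs for its share of the records. */
    static int commitsFor(int records, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        int commits = 0;
        int sinceLastCommit = 0;
        for (int i = 0; i < records; i++) {
            // a real step would read and process one record here
            sinceLastCommit++;
            if (sinceLastCommit == interval) {
                commits++;            // checkpoint: commit this transaction, start the next
                sinceLastCommit = 0;
            }
        }
        if (sinceLastCommit > 0) {
            commits++;                // assumed final commit for a trailing partial chunk
        }
        return commits;
    }

    /** Every subjob uses the same interval, mirroring the default shared checkpoint policy. */
    static int totalCommits(int[] recordsPerSubjob, int interval) {
        int total = 0;
        for (int records : recordsPerSubjob) {
            total += commitsFor(records, interval);
        }
        return total;
    }
}
```

    For 5 subjobs with 10 records each and an interval of 2, this gives 5 commits per subjob and 25 in total, which shows how quickly the commit count grows with such a small interval.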

    Regarding the parameterizer attaching to a DB: yes, that's fine. Just remember that the TLJ itself runs in its own checkpoint transaction and that the parameterizer is called in the scope of that JTA transaction. So consider your TLJ checkpoint timeout in relation to how time-consuming your parameterizer's database access is.
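
    For the parameterizer question, a hedged sketch of the general shape: once the subjob count is known (for example, read from a database, which is left out here), build one Properties object per subjob. This is plain Java rather than the Compute Grid Parameterizer API, and the class, method, and property names (subjob.index, record.start, record.end) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class SubjobPropertiesSketch {

    /**
     * Splits totalRecords into subjobCount contiguous ranges and returns one
     * Properties object per subjob. In a real parameterizer the subjob count
     * might come from a database query; here it is simply passed in.
     */
    static List<Properties> buildSubjobProperties(int subjobCount, int totalRecords) {
        if (subjobCount <= 0 || totalRecords < 0) {
            throw new IllegalArgumentException("need subjobCount > 0 and totalRecords >= 0");
        }
        List<Properties> result = new ArrayList<>();
        int base = totalRecords / subjobCount;       // minimum records per subjob
        int remainder = totalRecords % subjobCount;  // the first 'remainder' subjobs get one extra
        int start = 0;
        for (int i = 0; i < subjobCount; i++) {
            int size = base + (i < remainder ? 1 : 0);
            Properties props = new Properties();
            props.setProperty("subjob.index", String.valueOf(i));
            props.setProperty("record.start", String.valueOf(start));       // inclusive
            props.setProperty("record.end", String.valueOf(start + size));  // exclusive
            result.add(props);
            start += size;
        }
        return result;
    }
}
```

    Whatever replaces the passed-in count with a real database read should be kept short, since that read would run inside the TLJ's checkpoint transaction.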

    Also, while we're discussing timeouts, let me remind you that your app server's max timeout must be set greater than or equal to your greatest checkpoint interval timeout.

    -Chris
  • Vish_Pat
    7 Posts

    Re: Compute Grid PJM queries.

    ‏2012-03-06T05:41:27Z  
    That helped a lot.

    A few other doubts: as I mentioned, we are using CG 6.1 for development, and we are facing a few problems related to the development environment.
    I don't have RAD for development, nor do I have a WAS with CG on my local Windows machine; WAS is on a remote AIX machine.
    My development mostly consists of pulling JARs from the WAS on AIX and developing in Eclipse (on the local Windows machine). I use the batch packaging utility to create the EAR, then deploy and test it on WAS (AIX).

    Problems faced:

    1) The batch packager creates the EAR by reading a property file, but it assumes that your application will only have an EJB component and not a WAR or web component. The application.xml created by the packager doesn't have a web entry in it, and there are no options to add one either. (Well, you can do it if you change the templates that are read by the packager.) Is there any other solution to this, some property tweak or so?

    2) The EAR created by the packager has a backend folder with some mapping/association for each DB. What is the folder for? If I open any folder other than Derby, Oracle for example, I see some file names containing "NULL". Why is this so?
    Does the batch packager assume that the default DB will be Derby? If so, can I change it?

    3) The ejb-jar.xml created by the batch packager has a JNDI entry for the LREE DB as "jdbc/lree". Can this be changed before invoking the batch packager?

    4) Is the approach that I have used for development fine? (Well, this should have been the first question :))

    Thanks,
    Vishal
  • SystemAdmin
    783 Posts

    Re: Compute Grid PJM queries.

    ‏2012-03-06T15:26:37Z  
    Vishal,

    Your basic approach is fine. RAD is not strictly required. The BatchPackager only packages batch POJOs and automates EAR creation. If you want to add additional Java EE features to your batch application, you need to use a Java EE development tool like RAD or Eclipse Java EE. You can import the EAR file from the batch packager, edit it, then export it again. Technically, you can build EAR files by hand, but who wants to do that?

    The BatchPackager does not assume Derby. It adds deployment metadata to enable all WAS-supported databases. You pick your backend database when you install your app. The app install process also lets you bind a different JNDI name; jdbc/lree is simply the default.

    -Chris
  • Vish_Pat
    7 Posts

    Re: Compute Grid PJM queries.

    ‏2012-03-06T15:58:33Z  
    Chris,

    That clears most of my doubts, but why do the Oracle files in the backend folder have "NULL" in their names, whereas the file names in the Derby folder look normal?

    1) In your previous reply you mentioned that the TLJ runs in a global transaction. If I do a SELECT followed by a DML UPDATE in a parameterizer, would the update be committed to the DB before a subjob is started? I guess the answer is yes?

    2) If one of the subjobs fails and, as a result, the TLJ is also in a failed state, will the parameterizer be called when I restart the TLJ, along with the restart method of the job step?

    --Vishal