www.ProvatoSys.com
Michael Catalani
901.581.8791

 

               


       Work With Job Activation Groups

   

Download Steps For WRKJAG
(V6R1 or Later)

  1. Download the following file (WRKJAG.SAVF) by clicking HERE
     
  2. When prompted to save or open the file, choose to save it.
     
  3. Specify where on your PC you would like to save the file.
     
  4. Create a save file on your AS400 / iSeries.  I suggest a save file called WRKJAG in QGPL.
    CRTSAVF QGPL/WRKJAG
     
  5. Open a command prompt on your PC by using
    START / RUN / COMMAND
     
  6. At the prompt, key in
    FTP {system name}
    where system name is the name or ip address of your AS400 / iSeries.
     
  7. When prompted, enter your user profile.
     
  8. When prompted, enter your password.
     
  9.  Type in BIN
     You will be told that the representation type is binary.
     
  10. Type in PUT
     
  11. The system will prompt you for the local file, which is where you saved WRKJAG.SAVF during the download step. For example, if you saved the file to the root directory of the C: drive, then enter
    c:\wrkjag.savf
     
  12. You will be prompted for the remote file. This is the path and file for the save file on your system. If you used QGPL/WRKJAG for the save file in step 4 above, then key in
    /QSYS.LIB/QGPL.LIB/WRKJAG.SAVF
     
  13. The WRKJAG programs and command are in a library called WRKJAG. Restore the WRKJAG library. If you want to restore it under a different library name, add the RSTLIB keyword to the command with the new library name. (The command below omits it, so the library is restored as WRKJAG.)
    RSTLIB SAVLIB(WRKJAG) DEV(*SAVF) SAVF(QGPL/WRKJAG)
     
  14. Add the WRKJAG library (or the library you specified on the RSTLIB keyword) to your library list.
     
  15. On the command line, enter
    WRKJAG
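
If you would rather script the upload in steps 5 through 12 than type the FTP subcommands by hand, the following is a minimal sketch using Python's standard ftplib module. It is not part of the WRKJAG download. The function names savf_remote_path and upload_savf are hypothetical helpers, and the sketch assumes the save file from step 4 already exists.

```python
from ftplib import FTP  # standard library FTP client, used in the usage note


def savf_remote_path(library: str, savf: str) -> str:
    """Build the remote path for a save file, e.g. QGPL/WRKJAG (step 12)."""
    return f"/QSYS.LIB/{library.upper()}.LIB/{savf.upper()}.SAVF"


def upload_savf(ftp, local_path: str, library: str, savf: str) -> str:
    """Send local_path to the save file and return the remote path used.

    `ftp` is an already logged-in ftplib.FTP connection (steps 6 to 8).
    """
    remote = savf_remote_path(library, savf)
    with open(local_path, "rb") as fh:
        # storbinary switches the transfer to binary (TYPE I) before sending,
        # which matches typing BIN in step 9 and PUT in step 10.
        ftp.storbinary(f"STOR {remote}", fh)
    return remote
```

A typical call would open the connection with FTP("your-system-name"), call login() with your user profile and password, and then call upload_savf(ftp, r"c:\wrkjag.savf", "QGPL", "WRKJAG"). The system name here is a placeholder. The restore in step 13 is still run on the AS400 / iSeries itself.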

 

WRKJAG is a command that allows you to analyze a job's activation groups.  With this command, you can:
  • Display all jobs on the system, in a format similar to WRKACTJOB, but with activation group information.
  • Receive immediate feedback if WRKJAG detects an ILE program running in a job's DAG (Default Activation Group).
  • Drill down into a job to see the individual activation groups and details.
  • Drill into each activation group to see the individual programs, service programs, and modules that are running in the activation group.

NOTE: Due to some of the programming techniques used, I can only save the utility programs back to V6R1, so they cannot be restored on earlier releases. Also, the programs must have sufficient authority to display system jobs AND to display their activation group details.  By default, the 3 programs (JAGSUMR, ANZJACTR, ANZACTR) are owned by QSECOFR.  If the programs do not have sufficient authority, jobs may not be displayed, or may be displayed with no activation group details.

After following the download steps above, add the library for the WRKJAG command to your library list.  Then call the command by keying in WRKJAG. (You can also specify a subsystem on the command, e.g.:  WRKJAG QINTER)

The following is the initial screen for the WRKJAG command.  This is program JAGSUMR.

THE SUBSYSTEM AND IGNORE QSYSWRK FIELDS AT THE TOP OF THE SCREEN

There are two input fields at the top of the screen.  The first is Subsystem.  This is where you can key in a specific subsystem to display.

The IGNORE QSYSWRK SUBSYSTEM field defaults to "Y", which is why you don't see this subsystem when the screen is initially displayed, even though the default is to display all subsystems.  The reason is that the APIs WRKJAG uses to collect activation group data are sometimes delayed when a job is in a certain state, which can delay the screen refresh.  Many of the jobs in QSYSWRK cause a significant delay, and since we normally don't care about these jobs anyway, the default is to ignore this subsystem so that the screen displays faster.  You can always change this option to "N" if you want to display this subsystem.  (And you can see the delay in processing when you do.)

 

COLUMN HEADINGS

The jobs and subsystem monitors are displayed in a similar fashion to WRKACTJOB.  For example, the first job listed is called BACKUP.  This is a subsystem monitor job.  The second job is called CPJUNGLEBU, and it is running in the BACKUP subsystem. The USER and JOB NUMBER are to the left of the job name.

The ACT GROUPS column shows how many activation groups there currently are for the job.

The ACTIVATIONS column shows the total number of activations (programs / service programs / modules) that have been activated for the job.

The TOTAL STATIC SIZE column shows the total amount of memory that the job is utilizing to hold stuff such as program variables and open data paths. (This value does not include the size of the program object itself, as program objects can be shared across multiple jobs.)

The HEAPS column shows how many heaps there are inside the job.

The HEAP STORAGE column shows the total size of all of the heaps inside a job.

OPTIONS

Option 4 - This option will be used for a future enhancement. Currently, option 4 will remove a job from the screen. It does NOT end a job, or otherwise do anything to the job on the system; it simply removes the job from the display.  (You can put a 4 next to a few jobs to see it in action.  You can get the jobs back on the screen by pressing F5.)

Option 5 - This will call program ANZJACTR, which will display a list of the job's activation groups.  (This program is covered in a section further down this page.)

COMMAND KEYS

F3-Exit the program

F5-Refresh the screen.  This causes all of the data for all of the jobs to be recalculated and displayed.

F4-Displays a window subfile of all subsystems which are currently running on the system.

NOTE:

If you see the following message, you will need to sign off and sign back on in order to clear the activation group monitor.

The reason for this error is that the activation group API monitor for the job gets placed in an invalid state. This typically happens if the job is canceled while it is gathering activation group data.  I have found no way to reset the monitor.  (WRKJAG runs in a *NEW activation group, which would normally clear any errors from a previous activation since the previous activation group is deleted, but it does not work this way for the activation group monitor.  Once it gets hung up for a job, it remains hung up until the job is ended.)

You can duplicate this error by setting WRKJAG to display all subsystems, changing the QSYSWRK flag to "N", and pressing F5. Once the job starts processing, kill the program with a System Request option 2.  As soon as you run the WRKJAG command again, you will likely see this message.

 

DETECTING AN ILE PROGRAM IN THE DAG

In the above display, we can see that job QPADEV000H is red (and it will blink as well). This tells us that this job has an ILE program running in the DAG.  We will cover this in more detail in a later section of this page.

OPTION 5 - DISPLAY ACTIVATION GROUPS

From the display below, we can see some summary information for the job TIGERTROPH.  This job has 7 activation groups, with a total of 527 activations. The total static size the job is utilizing is just over 18MB. The job has 6 heaps that utilize 598KB of storage.

Let's place a 5 next to this job in order to display the activation groups in more detail. We get the display below:

This is program ANZJACTR, and from this display we can see the 7 activation groups the job currently has assigned to it. The first two activation groups are named *DFTACTGRP.  All jobs are assigned *dftactgrp #1 and *dftactgrp #2 when they start on the system.  *dftactgrp #1 is for *system stuff.  *dftactgrp #2 is for *user stuff, such as program variables for our applications. The *user portion of the *dftactgrp is known as the DAG.

In the first screen we saw that this job was utilizing over 18MB of static storage.  This screen says that 13MB of static storage is contained within the activation group called CATALOG. We can use option 5 next to this activation group in order to see the details of it.

This is program ANZACTR, and from this display we can see the programs and service programs that have activations in the CATALOG activation group.  From here, we can see that the program (*PGM) Catalog in library TROPHYCGI is utilizing nearly 1.3MB of static storage, and the service program (*SRVPGM) Trophy / Catalog_S is utilizing nearly 11.5MB of static storage.

Note: Just because a program or service program utilizes a lot of static storage does not mean that it is poorly designed or written.  However, we don't want jobs to keep activation groups and the static storage they require if they aren't needed anymore, as this causes the job to use more memory than it really requires.  By using WRKJAG, we can tell if certain programs may need to reclaim their activation groups more frequently than they are currently designed.  And if our system appears to have performance issues related to memory, WRKJAG allows us to quickly analyze the programs that are currently running, see the memory requirements each of these jobs have, and possibly spot programs that we could make a quick change to in order to clean up unneeded activation groups. 

 

SORTABLE COLUMNS

Each of the columns in program ANZACTR is sortable by simply double-clicking on it. The static size column sorts in descending order.
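
To show how the ANZACTR figures from the CATALOG example add up, here is a small sketch. It is not how WRKJAG itself is implemented. The sizes are the approximate values quoted above, and the function names are hypothetical.

```python
# Approximate figures quoted for the CATALOG activation group (in MB).
activations = [
    {"object": "TROPHYCGI/CATALOG", "type": "*PGM", "static_mb": 1.3},
    {"object": "TROPHY/CATALOG_S", "type": "*SRVPGM", "static_mb": 11.5},
]


def sort_by_static_size(rows):
    """Largest static storage first, like the static size column sort."""
    return sorted(rows, key=lambda row: row["static_mb"], reverse=True)


def total_static_mb(rows):
    """Sum the static storage of every activation in the group."""
    return sum(row["static_mb"] for row in rows)


for row in sort_by_static_size(activations):
    print(f'{row["object"]:<20} {row["type"]:<8} {row["static_mb"]:>6.1f} MB')
print(f"Total: {total_static_mb(activations):.1f} MB")
```

The two activations together account for roughly the 13MB that ANZJACTR reported for the CATALOG activation group.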

 

ILE IN THE DAG

An ILE program gets placed into the DAG when both of the following occur:

  1. An ILE program has an activation group specified as *caller
  2. The ILE program is called from an OPM program (or command line)

If an ILE program specifies *caller for its activation group, it's telling the operating system that the activation of this program is to be placed in the calling program's activation group. Since OPM programs run in the DAG, ILE programs with *caller which are called by OPM programs are then activated into the DAG.
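
The two conditions above can be sketched as a small decision rule. This is a simplified model for illustration only, not an operating system API. The Program class and the function names are hypothetical.

```python
from dataclasses import dataclass

DAG = "*DFTACTGRP"  # the user portion of the default activation group


@dataclass
class Program:
    name: str
    ile: bool            # True when the program is truly ILE (DftActGrp(*NO))
    actgrp: str = DAG    # "*CALLER", "*NEW", or a named group; ignored for OPM


def resolve_activation_group(pgm: Program, caller_group: str) -> str:
    """Return the activation group a program is activated into."""
    if not pgm.ile:
        return DAG                # OPM programs always run in the DAG
    if pgm.actgrp == "*CALLER":
        return caller_group       # inherits the calling program's group
    return pgm.actgrp             # a named group, or "*NEW"


def ile_lands_in_dag(pgm: Program, caller_group: str) -> bool:
    """True only for an ILE program that ends up activated in the DAG."""
    return pgm.ile and resolve_activation_group(pgm, caller_group) == DAG
```

In this model a call from the command line is treated the same as a call from an OPM program: the caller's group is the DAG, so an ILE program with *caller lands there.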

NOTE: Unfortunately, IBM utilizes the term "ILE" in multiple ways.  As an application designer, a program is ILE if it makes use of ILE features such as activation groups and exported subprocedure calls. In short, a program is truly ILE capable if the DftActGrp keyword is set to *NO.  Unfortunately, if you use the DSPPGM command, any program that is compiled with the RPGIV compiler has a flag set to "ILE".

Even if a program has a type of "ILE", as long as the DftActGrp keyword is set to *YES, it is essentially an OPM program, and will execute just like an OPM program in the DAG.  There is no difference between an OPM RPGIV program and an RPGIII program in the way they execute in the DAG.  But if an RPGIV program has a keyword of DftActGrp(*NO), then that program truly is an ILE program, capable of making use of the ILE features.  This type of program, if it gets routed to the DAG, is the issue we will look at here.  Just keep in mind that even though the DSPPGM command says that a program has a type of ILE, that simply means that it is an RPGIV program, not that it is capable of utilizing the ILE.

 

There are two primary problems that occur when ILE programs get routed into the DAG.

  1. Override Scoping
  2. Performance

Override Scoping

The ILE offers a much more robust method of performing overrides than the OPM environment.  A problem occurs when a program which was designed for ILE overrides is forced to operate in OPM mode in the DAG.  When this occurs, the overrides do not perform as they would in ILE mode, which means the override scoping may fail to operate the way it was designed to.

Performance

All programs utilize static storage within the job to hold program variables, open data paths, overrides, etc.  In OPM mode, this static storage is contained within the DAG.  In ILE, it is contained within other activation groups, either named activation groups or *NEW activation groups.  With ILE, an entire activation group can be easily removed from the job, thereby removing all static storage used by all programs and service programs that were activated within that activation group.  For example, a program may make calls to subprocedures located in dozens of service programs.  The total static storage that gets assigned within the activation group can be quite large, especially since recent RPGIV enhancements allow very large fields and data structures to be defined.  The ILE program design may require massive amounts of static storage for a job to execute, but as soon as the job is done with the program, all of the static storage can easily be removed from the job by ending the activation group.

When an ILE program gets sucked into the DAG, we can no longer remove the static storage for the ILE programs OR the service programs that get dragged into the DAG with them.  This is because we cannot perform a RCLACTGRP (reclaim activation group) on the DAG, like we can with other activation groups.  And although the RCLRSC command will remove OPM activations from the DAG, it will not remove ILE activations within the DAG. The only way to remove an ILE activation from the DAG is to remove the DAG itself, and the only way to do that is to end the job.
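
The cleanup rules in the paragraph above can be summarized in a toy model. This only restates the behavior described here; it does not call any system API. The function names and the OPMPGM program are hypothetical, and the real commands have more options than this shows.

```python
DAG = "*DFTACTGRP"  # the user portion of the default activation group


def rclrsc(activations):
    """Model of RCLRSC as described above: only OPM activations leave the DAG."""
    return [a for a in activations if not (a["group"] == DAG and not a["ile"])]


def rclactgrp(activations, group):
    """Model of RCLACTGRP: removes a whole activation group, but never the DAG."""
    if group == DAG:
        raise ValueError("the DAG cannot be reclaimed; it ends with the job")
    return [a for a in activations if a["group"] != group]


job = [
    {"pgm": "OPMPGM", "ile": False, "group": DAG},        # hypothetical OPM caller
    {"pgm": "TESTACT", "ile": True, "group": DAG},        # ILE program stuck in the DAG
    {"pgm": "CATALOG", "ile": True, "group": "CATALOG"},  # ILE in a named group
]

job = rclrsc(job)                # OPMPGM is removed, TESTACT stays
job = rclactgrp(job, "CATALOG")  # the named group and its storage are removed
print([a["pgm"] for a in job])
```

After both calls the only activation left in the model is TESTACT, the ILE program in the DAG, which stays until the job ends.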

As far as performance is concerned, when an ILE program drags its static storage requirements (and that of service programs and modules that it binds to) into the DAG, it can bring dozens or hundreds of megabytes worth of static storage requirements with it. Running in ILE, this isn't a problem, because we can purge these static storage requirements as soon as we're done with them.  But when a job sucks these ILE static storage requirements into its DAG, they remain for the life of the job. When you compound this across many jobs that may be sucking ILE programs into their DAG, it can have a serious impact on performance, as these jobs are now seriously bloated with ILE activations that are hung in the DAG. Every time a job with ILE activations in the DAG needs to run, it has to carry around the massive amounts of static storage with it, which means the job needs to have that amount of memory available to it in the subsystem.

In summary, the basic performance issue with ILE programs in the DAG is that the job must carry around this excess static storage requirement, even though it may never need to utilize any of it again for the life of the job.  This requires the job to utilize much more memory than it needs to, which inflates the amount of memory the job appears to require.  The problem is magnified by long-running jobs (since they require the additional memory for the life of the job) and by the number of jobs that are dragging ILE programs into the DAG.

The WRKJAG command will detect jobs which have ILE programs in their DAG. See the display below:

 

Job QPADEV000J is blinking in red, so it has an ILE program which has been activated in its DAG.  Let's take an option 5 next to this job to take a closer look.

Once we press 5 next to the job, we will get the display for the ANZJACTR program, which lists out all of the activation groups for the job.  Since we are interested in finding the ILE program(s) that are in the DAG, we now need to display *dftactgrp / activation group #2.  

All jobs are assigned a default activation group when they start. The default activation group is subdivided into a system portion and a user portion.  The user portion is the one we are concerned about, as that is where our applications' static storage variables will reside if the program is operating in the DAG.

In summary, *dftactgrp activation group #2 is the same thing as the DAG. When we talk about programs running in the DAG, we are specifically talking about the *user portion of the *dftactgrp.

Ok, so let's put a 5 next to the DAG to display the details for it.

Now we are in program ANZACTR.  An ILE program that is in here will display in blinking red.  (NOTE: Service programs will not blink red, even though they are ILE capable. This is because I wanted WRKJAG to locate the original ILE program that got sucked into the DAG.  Any ILE program may drag any number of service programs in with it if the service programs specify *caller for their activation group.  But service programs can't get sucked into the DAG in and of themselves; they require an ILE program to access them and drag them in with it.)

In this example, we can see that the program TESTACT is an ILE program.  Now, we need to research which OPM program called TESTACT.  The offending OPM program may not be in the DAG anymore. This is because an OPM program will have its activation removed from the DAG as soon as it returns with the LR indicator turned on.  ILE programs do not operate this way, and remain in the DAG for the life of the job.  So it is not uncommon to have an ILE program hung in the DAG after the original OPM program that called it has already been removed.

At this point, we simply need to scan the OPM programs on our system to find out which ones are calling TESTACT.  Then we can either change the OPM program to ILE or, if possible, change the TESTACT program to a named or *new activation group, which would also solve the problem.