Monday, March 11, 2024

PSFT_PLAN: A Sample Oracle Database Resource Manager Plan for PeopleSoft

In the cloud (or any virtualised environment), performance is instrumented as cost.  This is also true in any other on-premises environment, but it takes a lot longer to feedback!
  • If you never run out of CPU, then you have probably bought/rented/allocated/licensed too many CPUs.
  • If you do run out of CPU, then you should use the database resource manager to prioritise the processes that are most important to the business.
  • If you don't enable the resource manager, you will have less visibility of when you do run out of CPU.
At the very least, you can use one of the sample resource manager plans installed in the Oracle database by default.  
This article proposes a resource plan for PeopleSoft systems.  It can be used as a starting point before enhancing it with your own specific requirements.  

Resource Plan Design Goals

The purpose of a database resource plan is to prioritise important/urgent processes over less important/less urgent processes by allocating CPU, to the higher priority processes, and by restricting CPU, other resources, and the degree of parallelism for lower priority processes.
The design of a resource plan should reflect what the business defines as important.  

Consumer Groups

A resource plan consists of several resource groups with different priorities, and resource allocations. Each priority level defined by the business becomes a consumer group in the resource plan.  A consumer group can be allocated to one of 8 priority levels in a resource plan.  Multiple consumer groups can exist at the same priority level with different CPU guarantees (adding up to not more than 100%) and can include other limits.
I have made some assumptions about process priorities in a typical PeopleSoft system, and have grouped and ranked them in the table below starting with the highest priority.  Not all customers run all these processes.  Consumer groups and mappings that are not needed can be omitted. There are gaps in the priority levels to allow for other definitions to be introduced.
Priority Level Consumer Group %CPU Guarantee Comment
1SYS
_GROUP
100% Oracle system processes. Defined automatically.
2PSFT
_GROUP
100% Any process that connects to the database as either SYSADM (the default PeopleSoft owner ID) or PS has higher priority than other processes unless other rules apply. The online application (other than ad hoc query) falls into this category so that the online user experience is safeguarded before other PeopleSoft processes.
This includes remote call Cobol processes, but not remote call Application Engine that should be run in the component processor.
4BATCH
_GROUP
100% Process scheduler processes, and processes run by the process schedulers
5NVISION
_GROUP
100% nVision (NVSRUN) and nVision report book (RPTBOOK) processes
6PSQUERY
_ONLINE
_GROUP
90%Ad hoc queries are allocated to one of three consumer groups with the same priority, but different CPU guarantees, comprising:
  • on-line PS/Queries,
  • nVision reports run through the PIA,
PSQUERY
_BATCH
_GROUP
9%
  • PS/Queries run on the process scheduler using the PSQUERY application engine. A 4-hour maximum runtime limit is defined.
NVSRUN
_GROUP
1%
  • nVision through the 3-tier nVision client
8LOW
_GROUP
1%Other low-priority processes
LOW
_LIMITED
_GROUP
1%Other low-priority processes, but whose maximum query time is limited.
OTHER
_GROUPS
1%All other processes.  Defined automatically.

Consumer Group Mapping Priority

Sessions are allocated to the consumer groups.  They can be allocated explicitly, or via mapping rules that use various session attributes. As the attributes are set or changed, the consumer group will be set according to the matching rules.  
I have set the following attributes to be mapped in the following order of precedence.  The more specific mappings take precedence over the more generic ones.
PriorityMapping Attribute Comment
2Module, ActionThe PIA instrumentation sets attributes MODULE to the component name and ACTION to the page name. Specific component pages are allocated to specific consumer groups
3ModuleSpecific scheduled processes are allocated by name to specific consumer groups.  PeopleSoft instrumentation puts this name in the MODULE attribute.
4Client ProgramBatch and query processes are identified by program name and allocated to certain consumer groups.
5Oracle UserAnything that connects to the database as either SYSADM or PS is allocated to the PSFT_GROUP. So other mapping rules must take precedence over this mapping.

Required PeopleSoft Configuration

The PSFT_PLAN sample resource manager plan relies on MODULE and ACTION being set by the PeopleSoft Application.  Therefore, the following additional configuration is required.
  • Enable PeopleSoft instrumentation: Set EnableAEMonitoring=1 in ALL PeopleSoft application server and process scheduler domains so that PeopleSoft processes set MODULE and ACTION information in the session attributes (using DBMS_APPLICATION_INFO).  
See also:
  • Install instrumentation trigger for PeopleSoft (psftapi.sql).  Not all PeopleSoft processes are instrumented.  COBOL, SQR, and nVision do not set MODULE or ACTION.  When a PeopleSoft process is started by the process scheduler, the first thing it does is set its own status to 7, meaning that it is processing.  This script creates a database trigger that fires on that DML and sets the session attributes MODULE to the name of the process and ACTION to the process instance number.  Application Engine processes may then subsequently update these values again.

Consumer Group Mappings

Consumer groups are matched to session attributes.  The highest priority matching mapping is applied.  Mappings can be matched to literal values, or with LIKE or REGEXP_LIKE operations.
Mapping Priority Attribute Value Consumer
Group
Priority
Consumer Group
2MODULE_ACTIONQUERY_MANAGER.QUERY_VIEWER6PSQUERY_ONLINE_GROUP
3MODULERPTBOOK
NVSRUN
5NVISION_GROUP
PSQRYSRV%6PSQUERY_ONLINE_GROUP
PSAE.PSQUERY.%6PSQUERY_BATCH_GROUP
4CLIENT_PROGRAMPSRUNRMT2PSFT_GROUP
psae%
PSAESRV%
PSDSTSRV%
PSMSTPRC%
PSRUN@%
PSSQR%
pssqr%
sqr%
4BATCH_GROUP
PSQRYSRV%6PSQUERY_ONLINE_GROUP
PSNVSSRV%6NVSRUN_GROUP
SQL Developer
sqlplus%
Toad%
8LOW_GROUP / LOW_LIMITED_GROUP
5ORACLE_USERPS
SYSADM
2PSFT_GROUP

Resource Plan Script

Two SQL scripts are available on GitHub

Other Options

There are other resource manager options that are either not illustrated in the sample plan, or that are commented out.  They may be worth considering in some situations.

  • PeopleSoft does not use parallel query by default, but if you do use it, you may well want to limit which processes use how much parallelism.  Consumer groups can specify a limit to the parallel query degree.
    • If you use the resource plan to restrict the degree of parallelism, and you also plan to vary the number of CPUs in a cloud environment, then I suggest creating a resource plan for each number of CPUs and switch between the plans by changing the setting of  the RESOURCE_MANAGER_PLAN parameter.

  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    'PSFT_PLAN', 'NVISION_GROUP', 'nVision Reports.'
    ,mgmt_p5 => 100
    ,parallel_degree_limit_p1=>2
  );
  • A parallel query may queue waiting to obtain sufficient parallel query server processes.  A timeout can be specified to limit that wait and to determine the behaviour when the timeout is reached.  The query can either be cancelled raising error ORA-07454, or run at a reduced parallelism).

  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    'PSFT_PLAN', 'PSQUERY_ONLINE_GROUP'
    ,mgmt_p6 => 90
    ,parallel_queue_timeout=>900
    ,pq_timeout_action=>'RUN'
  );
  • A consumer group can restrict queries that run for a long time, or that are expected to run for a long time based on their optimizer cost.  They can be switched to the CANCEL_SQL group after a number of seconds and they will terminate with ORA-00040: active time limit exceeded - call aborted:.  This has only specified for the LOW_LIMITED_GROUP, and the PSQUERY_BATCH_GROUP for scheduled queries because the message is captured by the process scheduler and logged.  It has not been specified for PSQUERY_ONLINE_GROUP because this error is not handled well by the online application.  Just the Oracle error message will be displayed to the user without further explanation, which is neither friendly nor helpful.  Instead, there are PeopleSoft configuration options to limit query runtime.
DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    'PSFT_PLAN', 'PSQUERY_BATCH_GROUP'
    ,mgmt_p6 => 1
    ,switch_group => 'CANCEL_SQL'
    ,switch_time => 14400
    ,switch_estimate => TRUE 
    ,switch_for_call => TRUE
    );
  • Sometimes customers may have different priorities and different priorities at different times that cannot be satisfied by a single resource plan.  In which case, different resource plans can be activated at different times by different scheduler windows. 

Other Online Resources

Tuesday, February 20, 2024

What PS/Query is that?

Sometimes, performance analysis will turn up a problem SQL query that is probably a PS/Query. However, I need to know which PS/Query it is should I wish to alter it or talk to the user who wrote it. 

Is it a PS/Query?

It is quite easy to spot SQL queries that are generated from queries defined in the PS/Query tool. These are typical characteristics:

  • Single character row source aliases (eg. A, B, D) 
  • The same row source with a suffix 1 (eg. D1) for query security records.
  • Effective date/sequence subqueries are always correlated back to the same table.
  • Order by column position number rather than column names or aliases.
Sometimes, you may find SQL that looks like a PS/Query coming from other parts of PeopleSoft because a developer has copied the text of a PS/Query, usually into an Application Engine step.
SELECT A.EMPLID, A.ATTENDANCE, A.COURSE, B.DESCR, D.NAME, A.SESSION_NBR,
TO_CHAR(A.STATUS_DT,'YYYY-MM-DD'),B.COURSE
FROM PS_TRAINING A, PS_COURSE_TBL B, PS_PERSONAL_DTA_VW D, PS_PERS_SRCH_QRY D1
WHERE D.EMPLID = D1.EMPLID
AND D1.ROWSECCLASS = 'HCDPALL'
AND ( A.COURSE = :1
AND A.ATTENDANCE IN ('S','W')
AND A.COURSE = B.COURSE
AND A.EMPLID = D.EMPLID )

The text of a PS/Query is not stored in the database.  Instead, as with other objects in PeopleSoft, it is held as various rows in PeopleTools tables.  The PSQRY% tables are used to generate the SQL on demand.  We can query these tables to identify the query.  

PSQRYRECORD holds a row for every record referenced in the query (not including effective date/sequence subqueries).  My usual tactic is to write a SQL query on PSQRYRECORD, like the one below, that looks for PS/Queries that reference these tables with these table aliases (see PeopleSoft for the Oracle DBA, Chapter 11).  
REM findqry.sql
REM (c)Go-Faster Consultancy 2012

SELECT a.oprid, a.qryname
FROM   psqryrecord a
,      psqryrecord b
,      psqryrecord d
WHERE  a.oprid = b.oprid
AND    a.qryname = b.qryname
AND    a.oprid = d.oprid
AND    a.qryname = d.qryname
AND    a.corrname = 'A'
AND    a.recname = 'TRAINING'
AND    b.corrname = 'B'
AND    b.recname = 'COURSE_TBL'
AND    d.corrname = 'D'
AND    d.recname = 'PERSONAL_DTA_VW'
/
The example PS/Query above is TRN003__COURSE_WAITING_LIST from the HCM demo database.  However, my query on PSQRYRECORD found another PS/Queries with the same 3 records using the same row source aliases.  It is worth looking at queries on the same tables as they often suffer from the same problems, and you might want to make the same fix.  
Another source of results for this query (though not this time) can be when users copy a public PS/Query to a private one so they can alter it in isolation.
OPRID                          QRYNAME
------------------------------ ------------------------------
                               TRN002__SESSION_ROSTER
                               TRN003__COURSE_WAITING_LIST

Writing the query on PSQRYRECORD to find queries, which always is slightly different each time, is quite boring.  So I have written a script that will dynamically generate the SQL to identify a PS/Query.

Start with a SQL_ID

A SQL tuning activity will usually identify the SQL_ID and plan hash value of a statement.  If you are lucky, AWR will have captured the text and execution plan.  If not, you may have to try looking for a different SQL_ID that produces the same execution plan.  From the statement text, it is easy to see whether it might be a PS/Query.  
In this example, I have cut the SQL statement and execution plan back to show just the tables and indexes referenced.
SQL_ID c3h6vf2w5fxgp
--------------------
SELECT …
FROM PSTREELEAF B, PSTREENODE C, PS_OPER_UNIT_TBL A, PS_PRODUCT_TBL G 
…
UNION SELECT …
FROM PSTREENODE D,PS_TREE_NODE_TBL E, PSTREELEAF F 
…

--------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                                   | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------------------
|  *  7 |        INDEX STORAGE FAST FULL SCAN         | PSBPSTREELEAF    |   426K|    19M|       |  1178   (1)| 00:00:01 |
|    10 |          TABLE ACCESS BY INDEX ROWID BATCHED| PS_PRODUCT_TBL   |     1 |    41 |       |     3   (0)| 00:00:01 |
|  * 11 |           INDEX RANGE SCAN                  | PS_PRODUCT_TBL   |     1 |       |       |     2   (0)| 00:00:01 |
|  * 14 |              INDEX RANGE SCAN (MIN/MAX)     | PS_PRODUCT_TBL   |     1 |    21 |       |     2   (0)| 00:00:01 |
|  * 15 |       TABLE ACCESS STORAGE FULL             | PSTREENODE       |   135K|  5709K|       |   663   (1)| 00:00:01 |
|  * 17 |       INDEX STORAGE FAST FULL SCAN          | PS_OPER_UNIT_TBL |  1791 | 35820 |       |     4   (0)| 00:00:01 |
|  * 20 |       INDEX RANGE SCAN (MIN/MAX)            | PS_PSTREENODE    |     1 |    33 |       |     3   (0)| 00:00:01 |
|  * 23 |       INDEX RANGE SCAN (MIN/MAX)            | PSAPSTREELEAF    |     1 |    32 |       |     3   (0)| 00:00:01 |
|  * 26 |       INDEX RANGE SCAN (MIN/MAX)            | PS_OPER_UNIT_TBL |     1 |    20 |       |     2   (0)| 00:00:01 |
|    33 |          TABLE ACCESS INMEMORY FULL         | PS_TREE_NODE_TBL | 35897 |  1647K|       |     6   (0)| 00:00:01 |
|  * 35 |          TABLE ACCESS STORAGE FULL          | PSTREENODE       |   167K|  9670K|       |   663   (1)| 00:00:01 |
|- * 36 |       INDEX RANGE SCAN                      | PS_PSTREELEAF    |     1 |    39 |       |  1267   (1)| 00:00:01 |
|    37 |      INDEX STORAGE FAST FULL SCAN           | PS_PSTREELEAF    |   480K|    17M|       |  1267   (1)| 00:00:01 |
|  * 40 |       INDEX RANGE SCAN (MIN/MAX)            | PS_PSTREENODE    |     1 |    33 |       |     3   (0)| 00:00:01 |
|  * 43 |       INDEX RANGE SCAN (MIN/MAX)            | PS_TREE_NODE_TBL |     1 |    28 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

…
   7 - SEL$1 / B@SEL$1
  10 - SEL$1 / G@SEL$1
  11 - SEL$1 / G@SEL$1
…
  15 - SEL$1 / C@SEL$1
  17 - SEL$1 / A@SEL$1
…
  33 - SEL$6 / E@SEL$6
  35 - SEL$6 / D@SEL$6
  36 - SEL$6 / F@SEL$6
  37 - SEL$6 / F@SEL$6
…

I use this query on DBA_HIST_SQL_PLAN to extract the tables that have single-character row source aliases that correspond to PeopleSoft records, and put them into PLAN_TABLE. I use this table because it is delivered by Oracle as a global temporary table, so it is always there and I can make use of it even if I only have read-only access.

INSERT INTO plan_table (object_name, object_alias) 
with p as ( --plan lines with single letter aliases
SELECT DISTINCT object_owner, object_type, object_name, regexp_substr(object_alias,'[[:alpha:]]',2,1) object_alias
from dba_hist_sql_plan p
, ps.psdbowner d
where p.sql_id = '&&sql_id' --put SQL ID here--
and p.object_name IS NOT NULL
and p.object_owner = d.ownerid
and regexp_like(object_alias,'"[[:alpha:]]"') --single character aliases
), r as ( --PeopleSoft table records and the table name
select r.recname, DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) sqltablename
from psrecdefn r
where r.rectype = 0 --PeopleSoft table records
)
select r.recname, object_alias --referenced table
from p, r
where p.object_type like 'TABLE%'
and p.object_name = r.sqltablename
union --a query plan may reference an index and not the table
select r.recname, object_alias --table for referenced index
from p, r
, all_indexes i
where p.object_type like 'INDEX%'
and i.index_name = p.object_name
and i.owner = p.object_owner
and i.table_name = r.sqltablename
order by 2,1
/
I now have a list of records and row source aliases aliases
RECNAME         O
--------------- -
OPER_UNIT_TBL   A
PSTREELEAF      B
PSTREENODE      C
PSTREENODE      D
TREE_NODE_TBL   E
PSTREELEAF      F
PRODUCT_TBL     G

Next, I can run this anonymous PL/SQL block to dynamically build the SQL query on PSQRYRECORD (one reference for every table) and execute it to find the matching PS/Queries

DECLARE 
  l_sep1 VARCHAR2(20);
  l_sep2 VARCHAR2(20);
  l_counter INTEGER := 0;
  l_sql CLOB := 'SELECT r1.oprid, r1.qryname';
  l_where CLOB;
  
  TYPE t_query IS RECORD (oprid VARCHAR2(30), qryname VARCHAR2(30));
  TYPE a_query IS TABLE OF t_query INDEX BY PLS_INTEGER;
  l_query a_query;
BEGIN
  FOR i IN(
    SELECT *
    FROM plan_table
    ORDER BY object_alias
  ) LOOP
    l_counter := l_counter + 1;
    dbms_output.put_line(i.object_alias||':'||i.object_name);
    IF l_counter = 1 THEN
      l_sep1 := ' FROM ';
      l_sep2 := ' WHERE ';
    ELSE
      l_sep1 := ' ,';
      l_sep2 := ' AND ';
      l_where := l_where||' AND r1.oprid = r'||l_counter||'.oprid AND r1.qryname = r'||l_counter||'.qryname';
    END IF;
    l_sql := l_sql||l_sep1||'psqryrecord r'||l_counter;
    l_where := l_where||l_sep2||'r'||l_counter||'.corrname = '''||i.object_alias||''' AND r'||l_counter||'.recname = '''||i.object_name||'''';
  END LOOP;
  l_sql := l_sql||l_where||' ORDER BY 1,2';
  dbms_output.put_line(l_sql);

  EXECUTE IMMEDIATE l_sql BULK COLLECT INTO l_query;

  FOR indx IN 1 .. l_query.COUNT
  LOOP
    DBMS_OUTPUT.put_line (indx||':'||l_query(indx).oprid||'.'||l_query(indx).qryname);
  END LOOP;
END;
/

The seven records found in my execution plan become a query of PSQRYRECORD 7 times, one for each record, joined on operator ID and query name.

SELECT r1.oprid, r1.qryname 
FROM psqryrecord r1 ,psqryrecord r2 ,psqryrecord r3 ,psqryrecord r4 ,psqryrecord r5 ,psqryrecord r6 ,psqryrecord r7 
WHERE r1.corrname = 'A' AND r1.recname = 'OPER_UNIT_TBL'
AND r1.oprid = r2.oprid AND r1.qryname = r2.qryname AND r2.corrname = 'B' AND r2.recname = 'PSTREELEAF' 
AND r1.oprid = r3.oprid AND r1.qryname = r3.qryname AND r3.corrname = 'C' AND r3.recname = 'PSTREENODE' 
AND r1.oprid = r4.oprid AND r1.qryname = r4.qryname AND r4.corrname = 'D' AND r4.recname = 'PSTREENODE' 
AND r1.oprid = r5.oprid AND r1.qryname = r5.qryname AND r5.corrname = 'E' AND r5.recname = 'TREE_NODE_TBL' 
AND r1.oprid = r6.oprid AND r1.qryname = r6.qryname AND r6.corrname = 'F' AND r6.recname = 'PSTREELEAF' 
AND r1.oprid = r7.oprid AND r1.qryname = r7.qryname AND r7.corrname = 'G' AND r7.recname = 'PRODUCT_TBL' 
ORDER BY 1,2
The query finds several queries. I can look at the public PS/Queries in the Query Manager tool.  I can also see which users' private queries exist.
NB. You can only open public queries (where OPRID is a single space) or your own private queries.  In the Query Manager, you cannot see a private query owned by another user.
…
3: .PS_TREE_PRODUCT
4: .QUERY_PRODUCT_TREE
5: .RM_TREE_PRODUCT
6:XXXXXX.PS_TREE_PRODUCT_XX
…
The new findqry.sql script is available on Github.

Thursday, January 25, 2024

Reducing the Operating System Priority of PeopleSoft Processes

PeopleSoft for the Oracle DBA

I wrote about controlling the operating system priority of processes in PeopleSoft Tuxedo domains in Chapters 13 of 14 of PeopleSoft for the Oracle DBA, but I think it is worth a note here.

On Linux and Unix systems, the nice command can be used to lower the operating system scheduling priority of a process (or a privileged can increase the priority). When a server has no free CPU, processes with a lower priority get less time on the CPU. However, when there is free CPU available, the scheduling priority does not affect the amount of CPU that the process can utilise. 

On Unix, the priority of a Tuxedo server process can be adjusted using the -n server command line option in the configuration. The parameters to this option are simply passed through to the nice(2) function. Hence, this option does not work on Windows.

PSPRCSRV        SRVGRP=BASE
                SRVID=101
                MIN=1
                MAX=1
                RQADDR="SCHEDQ"
                REPLYQ=Y
                CLOPT="-n 4 -sInitiateRequest -- -C psprcs.cfg -CD HR88 -PS PSUNX -A start -S PSPRCSRV"
The operating system priority of a process is inherited from its parent. Therefore, lowering the priority of the Process Scheduler running under Tuxedo will also lower the priority of the batch processes that it spawns. 
  • Therefore Stand-alone Application Engine processes (psae) and Cobol processes inherit the priority of the process scheduler server process (PSPRCSRV).
  • However, if the Application Engine server process (PSAESRV) is used, its priority can be set directly. 
There are some potential uses for this approach.
  • If the process scheduler is co-resident with the application server, then it could be run at a lower priority to ensure the online users get preferential allocation of CPU, and that online performance does not suffer excessively at the hands of the batch.
  • A system might have two websites: one for self-service and the other for the 'back-office' users. You could configure separate application servers for each site, and run the self-service application server is run at a lower priority. 

In PeopleSoft, I prefer to create additional variables in the configuration file (psprcs.cfg).

[Process Scheduler]
;=========================================================================
; General settings for the Process Scheduler
;=========================================================================
PrcsServerName=PSUNX
;-------------------------------------------------------------------------
;Reduce priority of Process Scheduler server process, set to 0 if not needed
Niceness=4
...
From PeopleTools 8.4, the Application Engine server process is configured by default. The priority of the AE server processes can then be controlled independently of the process scheduler by creating a separate variable in the PSAESRV section of the configuration file.  However, it is generally better to use standalone PSAE, unless you have many short-lived application engine processes, as in CRM (see Application Engine in Process Scheduler: PSAESRV Server Process -v- Standalone PSAE executable).   
[PSAESRV]
;=========================================================================
; Settings for Application Engine Tuxedo Server
;=========================================================================
;-------------------------------------------------------------------------
;Reduce priority of application engine server process, set to 0 if not needed
Niceness=5
...
In this example, I have reduced the priorities of both the process scheduler and AE servers, but the process scheduler is left with a higher priority than the AE servers. The new variables can then be referenced Tuxedo template file (psprcsrv.ubx).
{APPENG}
#
# PeopleSoft Application Engine Server
#
PSAESRV         SRVGRP=AESRV
                SRVID=1
                MIN={$PSAESRV\Max Instances}
                MAX={$PSAESRV\Max Instances}
                REPLYQ=Y
                CLOPT="-n {$PSAESRV\Niceness} -- -C {CFGFILE} -CD {$Startup\DBName} -S PSAESRV"
{APPENG}
...
PSPRCSRV        SRVGRP=BASE
                SRVID=101
                MIN=1
                MAX=1
                RQADDR="SCHEDQ"
                REPLYQ=Y
                CLOPT="-n {$Process Scheduler\Niceness} -sInitiateRequest -- -C {CFGFILE} -CD {$Startup\DBName} -PS {$Process Scheduler\PrcsServerName} -A start -S PSPRCSRV"
When the domain is configured in psadmin, the variables are resolved in the Tuxedo configuration file (psprcsrv.ubb).  The -n option can be seen in the server command-line options (CLOPT).
#
# PeopleSoft Application Engine Server
#
PSAESRV         SRVGRP=AESRV
                SRVID=1
                MIN=1
                MAX=1
                REPLYQ=Y
                CLOPT="-n 5 -- -C psprcs.cfg -CD HR88 -S PSAESRV"
...
PSPRCSRV        SRVGRP=BASE
                SRVID=101
                MIN=1
                MAX=1
                RQADDR="SCHEDQ"
                REPLYQ=Y
                CLOPT="-n 4 -sInitiateRequest -- -C psprcs.cfg -CD HR88 -PS PSUNX -A start -S PSPRCSRV"

Friday, November 10, 2023

Prioritising Scheduled Processes by Operator ID/Run Control

Batch processing is like opera (and baseball) - "It ain't over till the fat lady sings".  Users care about when it starts and when it finishes.  If the last process finishes earlier, then that is an improvement in performance. 

This note describes a method of additionally prioritising processes queued to run on the process scheduler in PeopleSoft by their requesting operator ID and run control.  Where processing consists of more instances of the same process than can run concurrently, it can be used to make the process scheduler run longer-running processes before shorter-running processes that were scheduled earlier, thus completing batch processing earlier.

In PeopleSoft, without customisation, it is only possible to prioritise processes queued to run on the process scheduler by assigning a priority to the process definition or their process category.  Higher priority processes are selected to be run in preference to lower priorities.  Otherwise, processes are run in the order of the time at which they are requested to run.

Problem Statement

During an overnight batch, many nVision report books are scheduled to run on the Windows process schedulers by one of several specific batch operator IDs.  Many more reports are scheduled than can run concurrently, so some execute while others queue.  Inevitably, the reports have widely varying execution times.  The maximum concurrency of the nVision report book (RPTBOOK) process definition has been set, and the Oracle database resource manager has also been configured, to prevent too many of these processes from overloading the database.

CPU Utilisation of BatchThis chart shows the database activity when the batch runs.  We often see what has come to be called the 'long tail' while we wait for just a few long-running processes to complete.


The next chart shows the processing time of each nVision report process.  The blue bars run from when the started to when it ended, ordered by start time.   The clear boxes below run from the time when it was requested to when it started, thus showing the period for which the processes were queued on a process scheduler, but were blocked because the maximum number of processes were already processing.
Process Map (without prioritisation)
All these processes run with the same priority because they are the same process definition.  Some long-running jobs execute earlier in the batch simply because they were scheduled earlier, but others that started later, run on beyond the end of the batch.  
It would be better if the longest-running processes were executed earlier, irrespective of the order in which they were requested.  Thus the shorter processes can run later as slots on the scheduler become free, and thus all processes should finished both closer together and earlier.
There is nothing delivered in the PeopleSoft process scheduler configuration that will let you assign different priorities to different executions of the same process.  In PeopleSoft, only three priorities are defined PRCSPRIORITY (1=Low, 5=Medium, 9=High) on the process definition and the server category.  These priorities are transferred to the process request queue (PSPRCSQUE.PRCSPRTY).  
If we could put our own priority into that column we could control the priority of the request.  

Solution

Introduce a database trigger that fires on insert into PSPRCSQUE and sets a priority specified on a new metadata table.  PRCSPRTY is not validated by PeopleSoft, and therefore any value can be specified.  
The files are available in a Github repository davidkurtz/psprcspty. The exact metadata will vary with the use case and requirements, but I provided some examples of how it might be generated.

Metadata Table: PS_XX_GFCPRCSPRTY

We need a table that will hold the priority for each combination of process type, process name, operation ID, and run control ID.  A corresponding record should be created using the Application Designer project in the GitHub repository.

create table sysadm.ps_xx_gfcprcsprty
(prcstype  VARCHAR2(30 CHAR) NOT NULL
,prcsname  VARCHAR2(12 CHAR) NOT NULL
,oprid     VARCHAR2(30 CHAR) NOT NULL
,runcntlid VARCHAR2(30 CHAR) NOT NULL
,prcsprty  NUMBER NOT NULL
--------------------optional columns
,avg_duration NUMBER NOT NULL
,med_duration NUMBER NOT NULL
,max_duration NUMBER NOT NULL
,cum_duration NUMBER NOT NULL
,tot_duration NUMBER NOT NULL
,num_samples  NUMBER NOT NULL
) tablespace ptapp;

create unique index sysadm.ps_xx_gfcprcsprty
on sysadm.ps_xx_gfcprcsprty(prcstype, prcsname, oprid, runcntlid) 
tablespace psindex compress 3;

Trigger Before Insert into PSPRCSQUE

As processes are scheduled in PeopleSoft, a row is inserted into the process scheduler queue table PSPRCSQUE.  A trigger will be created on this table that fires after the insert.  It will look for a matching row on the metadata table, PS_XX_GFCPRCSPRTY for the combination of process type, process name, operator ID, and run control ID.  If found, the trigger will assign the specified priority to the process request.  Otherwise, it will take no action.

CREATE OR REPLACE TRIGGER sysadm.psprcsque_set_prcsprty
BEFORE INSERT ON sysadm.psprcsque
FOR EACH ROW
WHEN (new.prcsname = 'RPTBOOK')
DECLARE
  l_prcsprty NUMBER;
BEGIN
  SELECT prcsprty
  INTO   l_prcsprty
  FROM   ps_xx_gfcprcsprty
  WHERE  prcstype = :new.prcstype
  AND    prcsname = :new.prcsname
  AND    oprid = :new.oprid
  AND    runcntlid = :new.runcntlid;
 
  :new.prcsprty := l_prcsprty;
EXCEPTION
  WHEN no_data_found THEN NULL;
  WHEN others THEN NULL;
END;
/
show errors

In this case, I am only assigning priorities to RPTBOOK processes, so I have added a when clause to the trigger so that it only fires for RPTBOOK process requests.  This can either be changed for other processes or removed entirely.

Priority Metadata

How the priorities should be defined will depend on the specific use case.  In some cases, you may choose to create a set of metadata that remains unchanged.

In this case, the objective is that the processes to take the longest to run should be executed first.  Therefore, I decided that the priority of each nVision report book process (by operator ID and run control ID) will be determined by the median elapsed execution time in the last two months.  The priorities are allocated such that the sum of the median execution times for each priority will be as even as possible.  

I have created a PL/SQL procedure GFCPRCSPRIORITY to truncate the metadata table and then repopulate it using a query on the process scheduler table (although, an Application Engine program could have been written to do this instead).  Some details of that query will vary with the exact use case. The procedure is executed daily, thus providing a feedback loop so if the run time varies over time, or new processes are added to the batch, it will be reflected in the priorities.

REM nvision_prioritisation_by_cumulative_runtime.sql

set serveroutput on
create or replace procedure sysadm.gfcprcspriority as
  PRAGMA AUTONOMOUS_TRANSACTION; --to prevent truncate in this procedure affecting calling session
  l_hist INTEGER := 61 ; --consider nVision processes going back this many days
begin
  EXECUTE IMMEDIATE 'truncate table ps_xx_gfcprcsprty';

--populate priorty table with known nVision processes 
insert into ps_xx_gfcprcsprty
with r as (
select r.prcstype, r.prcsname, r.prcsinstance, r.oprid, r.runcntlid, r.runstatus, r.servernamerun
, CAST(r.rqstdttm AS DATE) rqstdttm
, CAST(r.begindttm AS DATE) begindttm
, CAST(r.enddttm AS DATE) enddttm
from t, psprcsrqst r
  inner join ps.psdbowner p on r.dbname = p.dbname -- in test exclude any history copied from another database
where r.prcstype like 'nVision%' --limit to nVision processes
and r.prcsname like 'RPTBOOK' -- limit to report books
and r.enddttm>r.begindttm  --it must have run to completion
and r.oprid IN('NVISION','NVISION2','NVISION3','NVISION4')  --limit to overnight batch operator IDs
and r.begindttm >= TRUNC(SYSDATE)+.5-l_hist --consider process going back l_hist days from midday today
and r.runstatus = '9' --limit to successful processes
and r.begindttm BETWEEN ROUND(r.begindttm)-5/24 AND ROUND(r.begindttm)+5/24 --started between 7pm and 5am
), x as (
select r.*, CEIL((enddttm-begindttm)*1440) duration -–rounded up to the next minute
from r
), y as (
select prcstype, prcsname, oprid, runcntlid
, AVG(duration) avg_duration
, MEDIAN(CEIL(duration)) med_duration 
, MAX(duration) max_duration
, SUM(CEIL(duration)) sum_duration
, COUNT(*) num_samples
from x
group by prcstype, prcsname, oprid, runcntlid
), z as (
select y.* 
, sum(med_duration) over (order by med_duration rows between unbounded preceding and current row) cum_duration 
, sum(med_duration) over () tot_duration
from y
)
select prcstype, prcsname, oprid, runcntlid 
, avg_duration, med_duration, max_duration, cum_duration, tot_duration, num_samples
--, CEIL(LEAST(tot_duration,cum_duration)/tot_duration*3)*4-3 prcsprty  --3 priorities
, CEIL(LEAST(tot_duration,cum_duration)/tot_duration*9) prcsprty  --9 priorities
--, DENSE_RANK() OVER (order by med_duration) prcsprty --unlimited priorities
from z
order by prcsprty, cum_duration;

  dbms_output.put_line(sql%rowcount||' rows inserted');
  commit;

end gfcprcspriority;
/
show errors
In testing, I found that using just the 3 delivered levels of priority was not sufficiently granular to prioritise the jobs adequately, so I chose to use 9 levels (1 to 9).  The process priority on PRCSQUE is not validated, so I can use any value.  I also found I could just rank the processes from 1 to n by duration, and that would also work.

It is possible to create additional priority levels for process categories (see also More Process Priority Levels for the Process Scheduler), but that still only works for prioritising different processes over each other.

Metadata

This is the metadata produced on a test system by the above query.  It will vary depending on what has been run recently, and how it performed.  There are more, shorter processes in the lower priority groups, and fewer, longer processes in the higher priority groups.  

PRCSTYPE                       PRCSNAME     OPRID        RUNCNTLID                        PRCSPRTY
------------------------------ ------------ ------------ ------------------------------ ----------
…
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_XXX1                          1
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_XXX3                          1
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_LLLL8                         1
…
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_LLLL9                         2
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT8                       2
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT1                       2
…
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_TEMPXX                        6
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_3                           6
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_17                          6
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT4                       7
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_28                          7
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_INCXXX                        8
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_MORYYY1                       8
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_24                          9
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_16                          9

A Test Script

This test script inserts some dummy rows into PSPRCSQUE to check whether a priority is assigned by the trigger. The insert is then rolled back.

INSERT INTO psprcsque (prcsinstance, prcstype, prcsname, oprid, runcntlid)
VALUES (-42, 'nVision-ReportBook', 'RPTBOOK', 'NVISION','NVS_RPTBOOK_17');
INSERT INTO psprcsque (prcsinstance, prcstype, prcsname, oprid, runcntlid)
VALUES (-43, 'nVision-ReportBook', 'RPTBOOK', 'NVISION','NVS_RPTBOOK_STAT1');
select prcsinstance, prcstype, prcsname, oprid, runcntlid, prcsprty from psprcsque where prcsinstance IN(-42,-43);
rollback;

You can see that it was successful because priorities 2 and 7 were assigned.

PRCSINSTANCE PRCSTYPE                       PRCSNAME     OPRID        RUNCNTLID                        PRCSPRTY
------------ ------------------------------ ------------ ------------ ------------------------------ ----------
         -43 nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT1                       2
         -42 nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_17                          7

Monitoring Script

This query in script process_prioritisation_by_cumulative_runtime_report.sql reports on the average, median, and cumulative median execution time for each nVision process that ran to success during the overnight processing window as calculated by the package GFCPRCSPRIORITY and stored in PS_XX_GFCPRCSPRTY.  It also compares that to the priority and last actual run time for that process.

Example Output

                                                                                                      Cum.                                            
                                                                        Average   Median            Median    Total         Last Run   Actual                   
                                                                  Prcs Duration Duration Duration Duration Duration     Num  Process Duration Duration Duration Priorty
PRCSTYPE             PRCSNAME   OPRID        RUNCNTLID            Prty   (mins)   (mins)   (mins)   (mins)   (mins) Samples Priority   (mins)    Diff    % Diff    Diff
-------------------- ---------- ------------ -------------------- ---- -------- -------- -------- -------- -------- ------- -------- -------- -------- -------- -------
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_4           9    90.65      131      209     1834     1997      23        6      189       58       44       3 
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_16          9   159.17      163      209     1997     1997      23        9      177       14        9       0

nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_14          8    89.26      127      215     1703     1997      23        6      167       40       31       2
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_24          8   115.87      117      165     1576     1997      23        9      144       27       23      -1

nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_MORYYY1       7    93.13       85      165     1459     1997      23        8      158       73       86      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_28          7    88.30       80      172     1374     1997      23        8      108       28       35      -1

nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_INCXXX        6    83.61       79      149     1294     1997      18        7      118       39       49      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_17          6    70.96       69      105     1143     1997      23        7       81       12       17      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_STAT4       6    68.00       72       81     1215     1997       8        7       81        9       13      -1

nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_MMMMMM        5    52.45       46      119      914     1997      22        5       91       46      100       0
nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_TEMPXX        5    50.48       49      104      963     1997      23        5       94       45       92       0
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_3           5    55.52       55       99     1018     1997      23        6       79       24       44      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_1           5    47.70       56      137     1074     1997      23        4       56        0        0       1
…

Monitoring Query

The query in prcsmap.sql is used to produce the data for a map of the processes, showing request time, time spent queuing, and time spent executing.  It is the basis of the second chart above. I normally run this in SQL Developer and export the data as an Excel workbook.  There is an example spreadsheet in the Github repository.

9 levels of Prioritisation

With prioritisation, we can see that the long-running jobs with higher priority ran earlier.  

We can also see that some of the higher-priority jobs that are scheduled later are running earlier than those scheduled earlier, and are thus finishing earlier.

There is no longer any tail of processing.  Instead, load drops quickly at the end of the batch, and the batch as a whole finishes earlier.










Thursday, April 27, 2023

Querying the PeopleSoft Message Log with SQL

It is easy to access the PeopleSoft message log in Process Monitor component, but it can be a little difficult to work with in SQL because it is stored in multiple tables.

This started when I wanted to generate a PeopleSoft log message as a single string of text, so I could investigate shared pool memory errors by searching for ORA-04031 errors.  Ultimately, the string 'ORA-04031' is stored in PS_MESSAGE_LOGPARM, but I wanted to see the whole error message.

  • Each process, has a request record on PSPRCSRQST, it can have many messages.
  • Each message is stored in the message log table PS_MESSAGE_LOG
  • The text of each message is stored in the message catalogue table PSMSGCATDEFN.  It can have up to 9 substitution strings (%1, %2, etc).
  • These correspond to up to 9 parameters stored on PS_MESSAGE_LOGPARM that are substituted into the message string.

I assemble the message text in a PL/SQL function exactly as PeopleTools programs do, substituting the variables in the message string from the message catalogue with the parameter values.  The PL/SQL function is put into the SQL query as a common table expression so that I don't have to create a function or package in the database.  The function returns the full message text in a CLOB, thus I can then easily manipulate the message string in SQL.

In this case, I wrote a SQL query to search for ORA-04031 (see psmsglogora4031.sql on Github), but the same PL/SQL function can be used in various queries.  

It can be slow to search the generated message string.  It can be faster to search PS_MESSAGE_LOGPARM directly. 

WITH FUNCTION psmsgtext(p_process_instance INTEGER, p_message_seq INTEGER) RETURN CLOB IS
  l_message_log ps_message_log%ROWTYPE;
  l_message_text CLOB;
BEGIN
  SELECT *
  INTO   l_message_log
  FROM   ps_message_log 
  WHERE  process_instance = p_process_instance
  AND    message_seq = p_message_seq;

  SELECT message_text
  INTO   l_message_text
  FROM   psmsgcatdefn
  WHERE  message_set_nbr = l_message_log.message_set_nbr
  AND    message_nbr     = l_message_log.message_nbr;

  --dbms_output.put_line(l_message_text);
  FOR i IN (
    SELECT *
    FROM   ps_message_logparm
    WHERE  process_instance = p_process_instance
    AND    message_seq = p_message_seq
    ORDER BY parm_seq
  ) LOOP
    --dbms_output.put_line(i.message_parm);
    l_message_text := REPLACE(l_message_text,'%'||i.parm_seq,i.message_parm);
  END LOOP;

  --and tidy up the unused replacements at the end
  RETURN REGEXP_REPLACE(l_message_text,'%[1-9]','');
END;
x as (
select r.prcstype, r.prcsname, r.oprid, r.runcntlid
, l.*, psmsgtext(l.process_instance, l.message_seq) message_text
from ps_message_log l
LEFT OUTER JOIN psprcsrqst r ON r.prcsinstance = l.process_instance
WHERE …
)
select *
from x
ORDER BY dttm_stamp_sec
/
Now, I can easily produce a report of messages, like this:


Process              Process         Operator                             Process  Msg                                   Msg         Msg
Type                 Name            ID         Run Control              Instance  Seq JOBID           PROGRAM_NAME     Set#  Msg#   Sev DTTM_STAMP_SEC
-------------------- --------------- ---------- ---------------------- ---------- ---- --------------- --------------- ----- ----- ----- ----------------------------
MESSAGE_TEXT
------------------------------------------------------------------------------------------------------------------------------------------------------
nVision-ReportBook   RPTBOOK         VP1        NVS_XXXXXXX_99            1234567    1 PRCS SCHDL      psprschd           65    70     0 01/04/2023 20.37.21
Process Request shows status of 'INITIATED' or 'PROCESSING' but no longer running

nVision-ReportBook   RPTBOOK         VP1        NVS_XXXXXXX_99            1234567    2 PRCS SCHDL      psprschd           65    73     0 01/04/2023 20.37.23
PSNT1 failed to post files to the report repository.  Server scheduled to try again on 2023-04-01-20.37.41.239539.  See log
...