Posted by Trond Stroemme
h2. Intro
Parallel processing is not a new concept, but one that is regularly overlooked when it comes to increasing ABAP performance. Why?
Your SAP system will (normally) have more than one process available at any given time. Still, most of us insist on using just one of them. This is a bit like a manufacturer relying on only one truck to bring the products from his plant to the shopping malls, when there's a whole fleet of trucks just standing by!
Not only that, but most SAP systems span more than one application server, each with a range of (hopefully) available processes. So, what are we waiting for?
h2. OK, so my program takes forever to execute. How can I put it on steroids?
In this blog, I'll show a practical example for dealing with one of the dreaded father-and-son relations in the wonderful world of SAP FI: Accounting documents, tables BKPF-BSEG. Prowling your way through these tables can really take its toll, both on the system itself and your patience. Not to mention that of the customer breathing down your neck.
I actually suggest you start there for some very interesting background info on the merits (and pitfalls) of using RFCs. There's also a link to Horst Keller's blog series and the official SAP documentation. In addition, the excellent book "ABAP Cookbook" from SAP Press (by James Wood) outlines the principles of using asynchronous RFCs for parallel processing (as well as providing loads of other cool stuff for enhancing your ABAP skill set). A highly recommended read.
Using asynchronous RFCs without caution is not recommended: you risk bogging down the system as well as running into errors that can be difficult to resolve. However, if you do decide to use parallel processing, the following might be a good starting point. I'll try to keep things simple and explain every step along the way.
h2. The sample program
We'll create a program to display info from BKPF/BSEG. The program will read all entries from BKPF (feel free to introduce your own selection criteria here, such as company code and/or fiscal year), and then retrieve all related entries from BSEG. This second step will be done by calling a function module via RFC, repeatedly, in parallel. We will try to balance the workload based on the number of documents in the tables, and the available processes on our SAP application servers.
Finally, we will examine the runtime analysis of the program and compare to a standard single process execution.
The test program basically consists of the following steps:
- Call function module SPBT_INITIALIZE to find out how many available processes we can use
- Split the documents into handy packages and call an RFC a number of times in parallel
- Wait for the results from the called RFCs and merge them all back into the final report
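The first step can be sketched on its own. This is a minimal illustration, not the original listing: the report name and the variables `lv_group` and `lv_max_processes` are my own assumptions, while `lv_free_processes` is the name used in the program's comments. A blank group name is taken to mean "all application servers".

```abap
REPORT ztst_spbt_init_sketch.   "hypothetical report name

DATA: lv_group          TYPE rzllitab-classname VALUE space,
      lv_max_processes  TYPE i,
      lv_free_processes TYPE i.

START-OF-SELECTION.
* Ask how many work processes exist for parallel tasks in the
* server group, and how many of them are free right now.
  CALL FUNCTION 'SPBT_INITIALIZE'
    EXPORTING
      group_name                     = lv_group
    IMPORTING
      max_pbt_wps                    = lv_max_processes
      free_pbt_wps                   = lv_free_processes
    EXCEPTIONS
      invalid_group_name             = 1
      internal_error                 = 2
      pbt_env_already_initialized    = 3
      currently_no_resources_avail   = 4
      no_pbt_resources_found         = 5
      cant_init_different_pbt_groups = 6
      OTHERS                         = 7.

  IF sy-subrc <> 0.
*   No usable resources: a real program would fall back to
*   sequential processing here.
    WRITE: / 'SPBT_INITIALIZE failed, sy-subrc =', sy-subrc.
    RETURN.
  ENDIF.

  WRITE: / 'Max processes :', lv_max_processes,
         / 'Free processes:', lv_free_processes.
```

The number of free processes is only a snapshot, so it is best treated as an upper limit for the number of tasks you start, not as a guarantee.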
The program is fairly straightforward. It reads BKPF and tries to split the retrieved entries into nice "packages" based on the key fields BUKRS and GJAHR. These are the two main parameters for our RFC-enabled function module, and we build range tables for them to make the work easier. The idea is to pass both ranges to the RFC-enabled function reading BSEG, so that the number of documents passed to each call of the function is more or less consistent. Since the number of financial documents will vary with company codes and fiscal years, we cannot ensure a 100% even workload, but this is just an example.
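The packaging idea might look roughly like the sketch below. It is a simplified assumption of mine, not the original code: only `lt_stat` and the threshold of about 1000 documents come from the program's comments, all other names are hypothetical. I also close a package whenever the company code changes, so that the BUKRS range and the GJAHR range can never combine into a company code/year pair that belongs to another package.

```abap
REPORT ztst_package_sketch.   "hypothetical report name

TYPES: BEGIN OF ty_stat,
         bukrs TYPE bkpf-bukrs,
         gjahr TYPE bkpf-gjahr,
         cnt   TYPE i,
       END OF ty_stat.

CONSTANTS gc_package_size TYPE i VALUE 1000.   "assumed threshold

DATA: lt_stat    TYPE STANDARD TABLE OF ty_stat,
      ls_stat    TYPE ty_stat,
      lr_bukrs   TYPE RANGE OF bkpf-bukrs,
      ls_bukrs   LIKE LINE OF lr_bukrs,
      lr_gjahr   TYPE RANGE OF bkpf-gjahr,
      ls_gjahr   LIKE LINE OF lr_gjahr,
      lv_in_pack TYPE i,
      lv_next    TYPE i,
      lv_years   TYPE i.

START-OF-SELECTION.
* Number of documents per company code and fiscal year
  SELECT bukrs gjahr COUNT( * )
    FROM bkpf
    INTO TABLE lt_stat
    GROUP BY bukrs gjahr.
  SORT lt_stat BY bukrs gjahr.

  LOOP AT lt_stat INTO ls_stat.
    lv_next = lv_in_pack + ls_stat-cnt.
*   Close the open package on a new company code, or when the
*   current entry would push it past the threshold
    IF lv_in_pack > 0 AND
       ( ls_stat-bukrs <> ls_bukrs-low OR lv_next > gc_package_size ).
      PERFORM close_package.
    ENDIF.
    IF lr_bukrs IS INITIAL.
      ls_bukrs-sign   = 'I'.
      ls_bukrs-option = 'EQ'.
      ls_bukrs-low    = ls_stat-bukrs.
      APPEND ls_bukrs TO lr_bukrs.
    ENDIF.
    ls_gjahr-sign   = 'I'.
    ls_gjahr-option = 'EQ'.
    ls_gjahr-low    = ls_stat-gjahr.
    APPEND ls_gjahr TO lr_gjahr.
    lv_in_pack = lv_in_pack + ls_stat-cnt.
  ENDLOOP.

  IF lv_in_pack > 0.
    PERFORM close_package.
  ENDIF.

FORM close_package.
* In the real program, this is where the RFC would be started
* with lr_bukrs and lr_gjahr as parameters.
  lv_years = lines( lr_gjahr ).
  WRITE: / 'Company code:', ls_bukrs-low,
           'years:', lv_years, 'documents:', lv_in_pack.
  CLEAR: lr_bukrs, lr_gjahr, lv_in_pack.
ENDFORM.
```

A single company code/year combination with more than 1000 documents simply ends up as a package of its own, which is why the workload can never be perfectly even.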
Based on the available resources (number of processes for all application servers, which we find using function SPBT_INITIALIZE), we then start to kick off calls to the RFC-enabled function module. This is done a number of times in parallel, using the statement CALL FUNCTION ... STARTING NEW TASK ... PERFORMING ... ON END OF TASK. With this addition, the calling program executes a specific form whenever control is passed back from the RFC "bubble" (common sense states you should use object orientation, and thus specify a method with CALLING ... ON END OF TASK, but for our example I find the classical procedural program better for illustration purposes).
When the aRFC finishes, the following happens:
- Control is passed back to the calling program
- The form (or method) specified in the CALL FUNCTION statement (addition PERFORMING ... ON END OF TASK) is called. This form enables you to retrieve any returning or exporting parameters from the called RFC, and use them in the main processing.
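Put together, the call, the callback and the wait could be sketched as follows. This is a stripped-down assumption, not the full program: it starts a single task, where the real program issues the call inside its main loop. The function name, its parameters, the form name and the table types are taken from this blog; the report name, the `g*` variables and the counters are hypothetical.

```abap
REPORT ztst_arfc_sketch.   "hypothetical report name

DATA: gt_bukrs    TYPE ztst_bukrs_range_tt,
      gt_gjahr    TYPE ztst_gjahr_range_tt,
      gt_results  TYPE ztst_bseg_results_tt,
      gv_taskname TYPE numc4,
      gv_sent     TYPE i,
      gv_received TYPE i,
      gv_lines    TYPE i,
      gv_msg(80)  TYPE c.

START-OF-SELECTION.
* gt_bukrs / gt_gjahr would hold the ranges of one package here
  gv_taskname = gv_sent + 1.

  CALL FUNCTION 'ZTST_READ_BSEG'
    STARTING NEW TASK gv_taskname
    DESTINATION IN GROUP DEFAULT
    PERFORMING receive_results_from_rfc ON END OF TASK
    EXPORTING
      im_taskname           = gv_taskname
      im_bukrs              = gt_bukrs
      im_gjahr              = gt_gjahr
    EXCEPTIONS
      communication_failure = 1 MESSAGE gv_msg
      system_failure        = 2 MESSAGE gv_msg
      resource_failure      = 3.

  IF sy-subrc = 0.
    gv_sent = gv_sent + 1.
  ELSE.
*   Typically no free process (3) or an unreachable server (1, 2)
    WRITE: / 'Task not started:', gv_taskname, gv_msg.
  ENDIF.

* Block until every started task has reported back
  WAIT UNTIL gv_received >= gv_sent.

  DESCRIBE TABLE gt_results LINES gv_lines.
  WRITE: / 'Result rows received:', gv_lines.

FORM receive_results_from_rfc USING value(p_taskname).
  DATA lt_partial TYPE ztst_bseg_results_tt.

* Fetch the exporting parameters of the finished task
  RECEIVE RESULTS FROM FUNCTION 'ZTST_READ_BSEG'
    IMPORTING
      re_results            = lt_partial
    EXCEPTIONS
      communication_failure = 1
      system_failure        = 2.

  IF sy-subrc = 0.
    APPEND LINES OF lt_partial TO gt_results.
  ENDIF.
* Count the task as done even if it failed, so WAIT can end
  gv_received = gv_received + 1.
ENDFORM.
```

Counting sent and received tasks is one simple way to know when to stop waiting. The counter is increased for failed tasks too, otherwise a single broken call would leave the main program waiting forever.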
By splitting the workload into sizeable chunks, we can execute a multitude of workloads simultaneously, thereby reducing the total execution time to a fraction of the time traditionally used. In my example, I was able to run this report in less than 5% of the time it took running it in one single process.
The program and function module are presented below. I've done my best to insert comments in order to explain what's going on, and hope you can use this as a template.
The function module has been created as an RFC function (check the Remote-Enabled Module box on the Attributes tab). Besides this, there's nothing special about it.
*&---------------------------------------------------------------------*
*& Report ZTST_CALL_ASYNC_RFC
*&---------------------------------------------------------------------*
*& A small test program for calling function modules a number of times
*& in parallel.
*&
*& The program calls the RFC-enabled function ZTST_READ_BSEG - which
*& reads entries from table BSEG and calculates totals
*&
*& A log table, ZTST_RFC_LOG, is used for logging the progress, both
*& from within this program and the function itself.
*&
*& The program will launch the function on all available app servers.
*&---------------------------------------------------------------------*
- The parameter P_PARA allows you to run in sequential mode for comparison purposes!
- Start by retrieving number of maximum and available processes
- Clear the log table of old entries
- Accumulate total number of BKPF per BUKRS and GJAHR
- Main loop - here, we loop at LT_STAT, which contains number of documents per company code
- and year (BUKRS and GJAHR). We use these figures to build ranges for BUKRS and GJAHR, which
- are then used when calling the RFC function module
- If you do not need to programmatically calculate the number of entries for each RFC call,
- for instance when processing a table sequentially, you can replace this logic with something simpler,
- a.k.a. Loop at table, call RFC for each 1000 entries.
- Means we have previous entries in ranges, but < 1500 total (including the current record).
- We add current record to previous ranges and run all of them.
- Now, run RFC for BUKRS/GJAHR current record (which has more than 1000 in count)
- That's it! We've called our RFC a number of times (hopefully more than one).
- Now, all that remains is to wait until all RFC's have finished.
- Write the contents of TASKLIST to show which servers were used and how things went...
- Booooring... sequential mode - for performance comparison only (try runtime analysis on each logic)
- Final touch: print the results of all our efforts.
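*&---------------------------------------------------------------------*
*& Form initial_selection
*&---------------------------------------------------------------------*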
- text
form initial_selection.
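*&---------------------------------------------------------------------*
*& Form call_rfc
*&---------------------------------------------------------------------*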
- text
form call_rfc.
- Note that it might not be a good idea to use ALL free processes, such as here.
- Doing so might cause minor inconveniences for other users, or nasty phone calls
- from Basis....
- A better idea would be to reduce lv_free_processes by, say, 5 before starting.
- Note that we are not using the IMPORTING parameter here;
- instead it's used when doing RECEIVE RESULT in form RECEIVE_RESULTS_FROM_RFC
- Retrieve the name of the server
- This could mean an app server is unavailable; no real need to handle this situation in most cases.
- (Subsequent calls to the same server will fail, but the FM should run nicely on all available servers).
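*&---------------------------------------------------------------------*
*& Form RECEIVE_RESULTS_FROM_RFC
*&---------------------------------------------------------------------*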
- Called when we return from the aRFC.
- -->VALUE text
- -->(P_TASKNAME) text
form receive_results_from_rfc using value(p_taskname).
- Note: WRITE statements will not work in this form!
- Update the TASKLIST table, which is used for logging
- Receive the results from the RFC
- Loop at partial results; include in our totals table
*"----------------------------------------------------------------------
*"*"Local Interface:
*"  IMPORTING
*"     VALUE(IM_TASKNAME) TYPE  NUMC4
*"     VALUE(IM_BUKRS) TYPE  ZTST_BUKRS_RANGE_TT
*"     VALUE(IM_GJAHR) TYPE  ZTST_GJAHR_RANGE_TT
*"  EXPORTING
*"     VALUE(RE_RESULTS) TYPE  ZTST_BSEG_RESULTS_TT
*"  CHANGING
*"     VALUE(CH_NETWR) TYPE  NETWR_AP OPTIONAL
*"----------------------------------------------------------------------
- This is a function module used by program ZTST_CALL_ASYNC_RFC
- for demo purposes. The idea is to show how to take long-processing
- programs and split them up into parallel processes, by calling
- RFC's asynchronously. The calling program will call this function
- module a number of times, then collect the results and process them.
h2. Checking the system load
While the program is running, you can check your RFCs with transaction SM66, which shows all work processes across the application servers of your system.
h2. Final word: run time analysis
Try using the run time analysis on the program, both when selecting parallel mode and when un-checking P_PARA (which causes a normal select within the main program). The runtime analysis won't show the additional load of the RFC modules running in parallel, but the total program execution time is far lower - and this, after all, is the main point of splitting a workload into separate parallel tasks.
Running in sequential mode (no parallel RFCs):
!https://weblogs.sdn.sap.com/weblogs/images/252050646/img5.GIF|height=275|alt=Sequentialrun|width=617|src=https://weblogs.sdn.sap.com/weblogs/images/252050646/img5.GIF!
Running in parallel mode:
!https://weblogs.sdn.sap.com/weblogs/images/252050646/img6.GIF|height=199|alt=Withparallel processing|width=626|src=https://weblogs.sdn.sap.com/weblogs/images/252050646/img6.GIF!
As you can see, the run time is dramatically reduced. Total system load may amount to the same (and should actually be slightly higher, with the overhead of the separate RFCs), but it's the total execution time that counts. Here, the execution time with asynchronous RFCs is roughly 10% of that of a classical "all-in-one" process.
The above tests were run in a system with approximately 35.000 entries in BKPF, and 100.000 in BSEG.
h2. Words of warning (again)
A few words on transactional RFCs: There are situations when you cannot use this technique. Commits cannot be performed inside an RFC - this would conflict with the session in which the main program is running. You can find more info on these topics by checking SAP Help for RFC programming. However, for the larger part of processor-intensive developments, it is a technique that is sadly overlooked. I recommend that everyone give it a try, provided they follow the guidelines provided by SAP on the topic.
In addition to the blogs and resources mentioned at the start, the following is worth checking out: