Working with Legacy Code – ABAP Style
Now is the table of our contents, made glorious summer by this Son of York.
Prologue
What is Legacy Code?
Why is Legacy Code “bad”?
If we do think it’s bad, what can we do about this?
What can you do about lack of clarity??
What can you do about all the dependencies??
Epilogue – Procedural Tests
Epilogue – OO Tests
Prologue
What I have been doing since I got back from Germany to Australia is reading famous academic articles, which all “proper” people who studied computer science probably read 20 years ago, and seeing how I can take the concepts and apply them to the world of ABAP. I also try to take a different approach to the other blogs I see on SDN, so as not to repeat what other people say. Some people get puzzled as to why my blogs are ten times as long as anyone else’s, but that’s just me. I know people nowadays are only supposed to have a ten second attention span, but I suspect a lot of people are capable of much more.
So let’s get going. “Working with Legacy Code” is all about taking an existing program which is “bad” and making it “better”. So, why is it bad in the first place, and how does making it better help us?
The seminal work on this subject was written by Michael Feathers back in 2004. You would think that nine years later the world would have moved on but I am willing to bet there are hundreds of SAP programmers out there who this very day during their work hours wrote heaps of “legacy code” according to his definition.
What is Legacy Code?
What, then, is legacy code? Probably not what you would think at first glance. Let us look at some possible definitions:-
My Definition My Definition Is This.
- Poltergeist
According to Wikipedia “Founded in England (where the Ruling House resides in London) in the 6th century, the Legacy was established to collect dangerous and ancient knowledge and artefacts, solve paranormal problems, and protect humanity from supernatural evils.”
That doesn’t sound right. Most likely very little of the SAP code written this day could be deemed as supernaturally evil, at least I would hope so, though I will give you an example in a minute.
- Non-SAP Code
Before reading the article that was what I would have guessed as the answer. Right from the word go when I started working on SAP implementations in 1997, the term “legacy” meant the system that SAP was replacing, and thus by definition a Bad Thing, no matter how much the current user base liked it. Once again everything written in the old system must be supernaturally evil, so is this what we mean by legacy code?
If you think about it that can’t be right - we can’t be “working with legacy code” if the system with the code in it no longer exists due to being replaced by SAP.
- Procedural Code
There are articles without number on the SDN and in magazines like SAP Insider which state in no uncertain terms that the code that is in fact supernaturally evil is procedural code, and that is given as the prime reason for swapping over to object orientated code, as opposed to logical reasons like “it helps you because…”. This is why I have been writing all these blogs about switching over to OO programming, to give myself some concrete reasons why it is better.
In fact no, not even procedural code is Legacy Code. Even the OO evangelists admit that OO programs designed badly can be worse than procedural programs. There is so much more scope to do things badly in the OO environment and if you give people enough OO programming rope most of them seem to hang themselves. My first few attempts at OO programs certainly self destructed.
- Non-Testable Code
That is the definition. Legacy Code is code that cannot be tested.
Why is Legacy Code “bad”?
“Code without tests is bad code. It doesn't matter how well written it is; it doesn't matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behaviour of our code quickly and verifiably. Without them, we really don't know if our code is getting better or worse.”
― Michael Feathers, Working Effectively with Legacy Code
As you can see by that definition if a procedural program has unit tests and an OO one does not, then it is the OO program that is legacy code. That is why SAP did not restrict the ABAP Unit framework to just classes and methods. My very first use of ABAP Unit was to add tests to an existing procedural program, and the benefit was enormous. I come back to this at the end of the blog.
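To make that concrete, here is a minimal sketch of what an ABAP Unit test on a procedural routine can look like. The report name, the routine CALCULATE_DISCOUNT and its discount rule are all made up for illustration; the point is only that a local test class can PERFORM an ordinary subroutine and check the result.

```abap
REPORT zlegacy_test_demo.

* Local test class: the definition comes before the routine it tests
CLASS ltc_discount DEFINITION FOR TESTING
  RISK LEVEL HARMLESS DURATION SHORT.
  PRIVATE SECTION.
    METHODS large_order_gets_discount FOR TESTING.
    METHODS small_order_gets_nothing  FOR TESTING.
ENDCLASS.

* The existing procedural routine (hypothetical business rule)
FORM calculate_discount USING    id_quantity TYPE i
                        CHANGING cd_percent  TYPE i.
  IF id_quantity >= 100.
    cd_percent = 10.
  ELSE.
    cd_percent = 0.
  ENDIF.
ENDFORM.

CLASS ltc_discount IMPLEMENTATION.
  METHOD large_order_gets_discount.
    DATA ld_percent TYPE i.
    PERFORM calculate_discount USING 100 CHANGING ld_percent.
    cl_abap_unit_assert=>assert_equals( exp = 10
                                        act = ld_percent ).
  ENDMETHOD.

  METHOD small_order_gets_nothing.
    DATA ld_percent TYPE i.
    PERFORM calculate_discount USING 99 CHANGING ld_percent.
    cl_abap_unit_assert=>assert_equals( exp = 0
                                        act = ld_percent ).
  ENDMETHOD.
ENDCLASS.
```

This assumes a release where CL_ABAP_UNIT_ASSERT and the RISK LEVEL / DURATION additions are available; on older systems the equivalent class and pseudo comments would be needed instead.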
Another similar quote from Michael Feathers about tests, just to ram home the point:-
“Most of the fear involved in making changes to large code bases is fear of introducing subtle bugs; fear of changing things inadvertently. With tests, you can make things better with impunity. To me, the difference is so critical, it overwhelms any other distinction.”
So, since 99% of development work is changing existing programs, and the biggest problem with changing existing programs is breaking something unrelated to the change, code without tests is “bad” because you have no way of knowing what you have just broken.
If we do think it’s bad, what can we do about this?
Answer – “Add Tests”.
OK, we want to add loads of tests to our programs to “cure” them. Let us see:-
· Why we cannot add tests
· Why we would like to add tests
· What to do to change things so we CAN add tests
OO everywhere man
On the Internet a lot of wishful thinkers say that nobody in the SAP space does procedural programming any more and has not since the advent of version 4.6. My gut feeling, and observation at assorted companies, tells me however that there is not as much custom OO code about as you might think.
Even SAP at first seemed to think that if you wrapped a class around a function module then suddenly everything was magically object orientated. I have seen a lot of standard classes like that in the system.
Monkey see, monkey do, so a lot of SAP customers did the exact same thing. I have seen lots of Z classes with all static methods, everything public, including the data. That is sort of missing the point.
Unbelievably some people have not even yet made the jump to procedural programming. I saw one program – written in 2010 – a report with lots of lines of code, using WRITE statements, with no subroutines at all, just the good old SELECTION-SCREEN followed by everything in one huge mass after START-OF-SELECTION.
Get to the point
All right, the point is, that IT IS generally easier to add tests to OO programs than procedural ones, and all the documentation and helpful articles you find on the internet – even going back ten years or more – presume you are starting with an OO program in the first place that you want to refactor. That is because they were all written with Java or C++ or whatever in mind, but now I see in the SAP space similar articles all once again starting with the premise that the starting point is an OO program with a bad design and no tests.
I put it to you, learned counsel for the defence, that many people are in a boat like me, where you constantly have to add improvements and fixes to monolithic procedural programs over ten years old, which are vital to the day to day running of the business. I imagine SAP employees feel the same way about SAPMV45A.
Big Bang Theory
If you attempt to rewrite the whole thing in one Big Bang, you will win no popularity contests, take forever, break the thing thoroughly and bring down the company, and when you are standing in the dole queue it is cold comfort that you were doing the “right” thing.
So, it’s impossible? The nature of such monolithic programs means you just cannot add unit tests as the various aspects of the program – user interface, business logic, database access, external interfaces etc. are all so hopelessly welded together.
This is somewhat reminiscent of the urban legend that says when you ask someone in Ireland for directions they say “you can’t get there from here”.
To counteract that, let’s have a Star Trek quote, why don’t we. I love them - they reinforce the nerdy image of us programmers.
“’You can’t get past light speed without getting to light speed first … can’t be reached. Can’t. Be. Done.’
‘Yet you did it, Mr. Cochrane’”
That was from “Federation” by Judith & Garfield Reeves-Stevens, about solving another impossible problem.
Is this real, or just a dream – there’s nothing that is in between
As I mentioned, this is not an academic exercise for me. This is something I have to deal with every working day, and often some of the things I have to do to make the required change make me want to cry. They work, and take a very small amount of time, but are so fundamentally wrong it beggars belief.
Today for example, after a change – that worked perfectly by the way and took no time at all – I commented my change thus:-
“I have just changed the above global variable in this subroutine, off it goes into the wild blue yonder, in many subroutines’ time it will get exported into a custom Z table, a table which exists for the sole purpose of storing data until this program SUBMITS another program to read back that same data from the Z table one second after it has been written”
This is all wrong on so many levels. It works NOW but what if one day someone makes a change to one of the many subroutines between me writing the value and it getting exported to the table? Or someone changes the program that gets submitted? There is so much scope for disaster in the future.
As mentioned earlier, the assumption is you can’t instantly rewrite the program to make it a bit more sensible. The real problem of course, the whole focus of this blog, is that if someone does make such a change they will have no way at all of knowing they have stuffed up my fix. How could they? Their new change works, so they are a happy bunny.
The only way they could know they had an adverse effect is if there were unit tests they could run to see if any existing functionality was compromised.
Uphill Battle
So, I can make a really fast change that works, and leave the program in a potentially unstable state, or I could take some more time and add some tests (if at all possible).
I was reading a blog where someone was talking about just this subject and advocating refactoring code and adding tests and got this response:-
“Anonymous said...
one of the sure signs of a programmer...not a mention of costs
and time to completion
when you hire a painter to redo the kids bedrooms, you don't
expect them to "refactor" a hole in the back wall "to make it
easier to repaint the rooms when they are teenagers"
you're familiar with "don't reinvent the wheel" but how about
"don't redo tests"???”
This is what we are up against – doing something properly does take time and money, and in a world where most companies are only allowed to look three months ahead and, say, sack their USA head of sales for not meeting quarterly targets, the long term view does not get a look in. That is despite the fact that, in the example above, repairing the hole in the wall now might stop the house falling down in a few years, and that cost saving will offset the current cost of hole repairing by a factor of a thousand.
Walk Don’t Run
So, I am going to assume a situation where you have such a monolithic application, where you have to change it all the time, and you can’t do a wholesale rewrite because it is just too difficult and risky. However, one tiny bit at a time, you can start to change things so that when you do try and re-organise things the change is LESS difficult and risky.
Staying with the definition, the reason this program is legacy code is that there are no tests and we can’t add tests. So, if I am adding a feature or fixing a bug why can’t I just add a test there and then, and eventually there will be tests for everything? It’s not quite as easy as that. I will go over two main categories of reasons why you can’t just pop in a test.
1. You have no idea how in the world the code works. It just does, like black magic. It is difficult to pin down what parts of the program to test. I call this category CLARITY. This is often 75% of the battle, and is going to be 90% of this blog.
2. It is technically impossible to do a unit test on the function being added or fixed because there are DEPENDENCIES.
What can you do about lack of clarity??
Clarity Begins at Home
Sometimes you have to debug for ages through a complicated program to try and work out just where in the world an error is coming from. Eventually you fathom what is going on, fix the problem and all is well. Then six months later you have to come back and fix another problem and you can’t remember how this monster works as it is so illogical.
As a favour to yourself, and to your colleagues current and future, you could apply what Robert C. Martin calls “The Boy Scout Rule” to the area of code you have just fixed or enhanced.
He says:-
“It's not enough to write the code well. The code has to be kept clean over time. We've all seen code rot and degrade as time passes. So we must take an active role in preventing this degradation.
The Boy Scouts of America have a simple rule that we can apply to our profession.
· Leave the campground cleaner than you found it.
If we all checked-in our code a little cleaner than when we checked it out, the code simply could not rot. The cleanup doesn't have to be something big. Change one variable name for the better, break up one function that's a little too large, eliminate one small bit of duplication, clean up one composite if statement.
Can you imagine working on a project where the code simply got better as time passed? Do you believe that any other option is professional? Indeed, isn't continuous improvement an intrinsic part of professionalism?”
Here is a link to the advert for his book on the subject, with some other good quotes, just keep going back and forwards.
http://www.informit.com/articles/article.aspx?p=1235624&seqNum=5
When I showed this to one of my colleagues he responded thus:-
“The boy scouts also have a saying “Be prepared”.
In the case of (other external contractors) and myself, that might mean Be Prepared for (the CIO) to march us out the door if we introduce a bug in an area that is not in scope that we are meant to be working on.”
I see the argument – any change you make no matter how trivial has a chance of causing a bug, and when the finger pointing starts you get asked “why EXACTLY did you change that? Was that change in the specification?”
As an example, one of my colleagues decided to change a commented out section of native SQL commands (i.e. where you access the database directly between EXEC SQL and ENDEXEC instead of using the normal SELECT) from having an asterisk at the start of each line to having a quotation mark at the start. Little did we realise that when you “comment out” such native SQL commands with a quotation mark (") you are not commenting them out at all; the system tries to execute them. Who would have thought? He was trying to make the code clearer and – oh dear!
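For anyone who has not been bitten by this, a small sketch of the difference. The report name is invented, and the SELECT on T000 is only there to have something harmless to run; exact behaviour of a stray quotation mark line will depend on the database underneath.

```abap
REPORT znative_sql_comment_demo.

DATA gd_rows TYPE i.

START-OF-SELECTION.
  EXEC SQL.
*   An asterisk in the first column is still a comment in here
    SELECT COUNT(*) INTO :gd_rows FROM T000
  ENDEXEC.
* Had the comment line inside the block started with a quotation
* mark instead, it would not have been treated as a comment but
* handed on as part of the statement.
  WRITE: / 'Rows in T000:', gd_rows.
```

The safe habit is to leave asterisk comments inside EXEC SQL blocks exactly as they are, however much you prefer the other style everywhere else.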
However as might be imagined, I am on the side of the cleaner uppers, so let us look at some areas where we can improve the clarity of an existing program one tiny bit at a time.
DRY
That stands for “don’t repeat yourself”. This very day one of my colleagues was tasked with doing a final purge of all custom code that tried to call transaction ME23 as opposed to ME23N. He noticed I had the monolithic monster of which I speak locked, as I usually do, and he asked whether, while I was there, I could change the CALL TRANSACTION in a certain subroutine from ME23 to ME23N. The work of seconds.
After I told him I was done he said “oh I forgot, I also found it in the same program in subroutine XYZ as well”. Off I go to that subroutine, and to my puzzlement I find it exactly the same as the first subroutine I changed. Clearly all of the code, dozens of lines, had been cut and pasted from one place to another in the same program, resulting in two identical (large) subroutines concerned with drilling into assorted transactions. Presumably each one gets called half of the time by different areas of the program.
The obvious consequence is that they will diverge over time – in this case if my colleague had not told me about the second subroutine then we would have ended up with a program that sometimes drilled into ME23N and sometimes into ME23, seemingly at random from the user’s point of view.
The answer in this example is obvious – the subroutines were 100% identical, so I delete one, run the syntax check to see where it gets called, and replace each call with one to the remaining subroutine.
Far more common are blocks of code in the middle of a subroutine that have been cut and pasted, sometimes the block is identical, sometimes one value in one line out of thirty has been changed. Both cases are crying out to be encapsulated, the latter with a parameter for the varying bit.
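As a rough sketch of “a parameter for the varying bit”, imagine the cut and pasted blocks all did a drill down and differed only in which transaction they called. The routine name DISPLAY_DOCUMENT and the report are invented for this example; 'BES' is the usual parameter ID for a purchase order number.

```abap
REPORT zdrill_down_demo.

PARAMETERS p_ebeln TYPE ebeln.

START-OF-SELECTION.
* One call per former copy of the block; only the arguments differ
  PERFORM display_document USING 'BES' p_ebeln 'ME23N'.

* The single remaining copy of the drill down logic
FORM display_document USING id_parameter_id TYPE csequence
                            id_document     TYPE clike
                            id_transaction  TYPE csequence.
  SET PARAMETER ID id_parameter_id FIELD id_document.
  CALL TRANSACTION id_transaction AND SKIP FIRST SCREEN.
ENDFORM.
```

When ME23 has to become ME23N, or the drill down needs an authority check, there is now exactly one place to look.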
As can be imagined this applies even more to cutting and pasting large chunks of code between different programs. It is so easy, and so fast, and the boss wants this new thing working TOMORROW so the temptation is overwhelming.
The equation here is this: does the extra time it takes you now to encapsulate the block of code outweigh the benefits? Next time you need this in another program (and if you need it again once, then you’ll need it again sometime down the track) you just call the Z function (or hopefully class), and you know it works as it works somewhere else. More importantly, you avoid any negative consequences if the multiple versions of the same code in your system fall out of synch.
As another real example, I got an email from a colleague once saying “you recall when we improved the XYZ logic? It turns out we changed four programs where the logic was, but missed program ABC and the salesmen are up in arms as it is price rise time next week”. If the code had been in just one place that literally could not have happened. Needless to say, it is in one place now.
When I find myself solving the same problem twice with code, I now find it physically impossible to just cut and paste the old code in. Maybe I am lucky in having an understanding boss.
The ultimate example is when you find yourself solving the same problem not just two or three times, but hundreds of times, in almost every program you encounter. The example I would like to share is that I find myself spending an inordinate amount of time getting the names of things like sales organisations, divisions, vendors, materials, equipment types etc. To make life fun for you SAP has no consistency at all: some names are in text tables, some are in a fixed value list in a domain, and some, like vendors, are in the main table. There are as many different ways to store the name as there are stars in the sky.
As the years go by I find I memorise which tables to read for various data element key values, but I sometimes forget and have to go looking again, and it’s not always obvious. In the same spirit of doing a favour to myself I thought I’ll fix this once and for all.
Legacy Figure 01
I did a one-off exercise of building a complex algorithm which did a runtime identification of the data element passed in and then searched for text tables, and then a domain value list, to get the text name. If that did not work I subclassed the complicated ones like vendors and materials, not that the calling program knows this. I won’t go into detail on this now, but it is the principle of the thing.
I never have to worry about going hunting for the correct text table again. That is my ultimate example of a bit of work up front for a large saving later.
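To give a flavour of the idea without the full algorithm, here is a minimal sketch of just one step: reading the text from a domain’s fixed value list at runtime. It is not my actual class; LCL_TEXT_FINDER and its method are invented for the example, the text table search and the special cases are left out, and SHKZG (the debit/credit indicator) is simply assumed to exist in the system with fixed values.

```abap
REPORT zname_lookup_demo.

CLASS lcl_text_finder DEFINITION.
  PUBLIC SECTION.
    CLASS-METHODS get_fixed_value_text
      IMPORTING id_value       TYPE any
      RETURNING VALUE(rd_text) TYPE string.
ENDCLASS.

CLASS lcl_text_finder IMPLEMENTATION.
  METHOD get_fixed_value_text.
    DATA: lo_type   TYPE REF TO cl_abap_typedescr,
          lo_elem   TYPE REF TO cl_abap_elemdescr,
          lt_values TYPE ddfixvalues,
          ls_value  TYPE ddfixvalue,
          ld_low    TYPE ddfixvalue-low.

*   Work out at runtime what sort of thing was passed in
    lo_type = cl_abap_typedescr=>describe_by_data( id_value ).
    IF lo_type->kind <> cl_abap_typedescr=>kind_elem.
      RETURN.
    ENDIF.
    lo_elem ?= lo_type.
    IF lo_elem->is_ddic_type( ) = abap_false.
      RETURN.
    ENDIF.

*   Ask the dictionary for the domain fixed values, if any
    lo_elem->get_ddic_fixed_values(
      EXPORTING  p_langu        = sy-langu
      RECEIVING  p_fixed_values = lt_values
      EXCEPTIONS not_found      = 1
                 no_ddic_type   = 2
                 OTHERS         = 3 ).
    IF sy-subrc <> 0.
      RETURN.
    ENDIF.

    ld_low = id_value.
    READ TABLE lt_values INTO ls_value WITH KEY low = ld_low.
    IF sy-subrc = 0.
      rd_text = ls_value-ddtext.
    ENDIF.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA: gd_indicator TYPE shkzg VALUE 'S',
        gd_text      TYPE string.
  gd_text = lcl_text_finder=>get_fixed_value_text( gd_indicator ).
  WRITE: / gd_indicator, gd_text.
```

The caller only ever passes in a typed variable and gets a name back; where that name is stored is no longer its problem.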
Life in Six Words
It was said that Shakespeare summed up human life in six words – “Bloom, Bloom, Bloom, Rot, Rot, Rot”.
Are our programs like that? We create things of beauty and then they gradually rot away over time as different people graft extra things onto them and fill them with workarounds and hacks until they are incomprehensible and unchangeable. Unchangeable in this context means one change breaks six existing things, and fixing those six breaks thirty-six other things, which has been compared to the Hydra where you cut one head off and six more grow back.
One of my favourite examples of rotting code is the ever popular commenting out of the code you have just replaced. The first time this makes perfect sense: you think “when someone sees this, they get a clear picture of the before and after”, though usually there is no explanation of WHY the code was replaced.
In no time at all, you get a huge sea of blocks of commented out code with little islands of live code in the middle feeling really lonely, looking round wondering where all their friends are. I have seen this again and again. There is no way that makes things easier to understand; you have to page up and down like a lunatic to try and get a grip on what is going on.
When I first saw this I presumed the programmers had no idea there was a version management system for ABAP programs which gave you a complete history of changes. As time goes on I am forced to conclude people do this because “we’ve always done it this way”.
In the same way, people look at what OSS note code changes look like and do the same themselves:-
* IF SOMETHING THEN DO_SOMETHING "Remove - Ticket 765438765
IF SOMETHING OR SOMETHING_ELSE THEN DO_SOMETHING "Replace - Ticket 765438765
Leaving aside the line of dead code that has arrived to turn into a bloated corpse and spread bubonic plague around the program, this trick of putting ticket numbers in comments happens often, and people are pleased as punch and say “look I have given a reference number so if someone really wants to know why that change was made then they can go off and look it up”. True enough, unless of course the comments were written by your consultants on the implementation project referring to THEIR helpdesk system, and they have been gone ten years. But even leaving aside how easy it is to wade through the change request system, I can think of two better ways to achieve the same thing straight off.
1. If it is that important that someone looking at the code should know why it changed then, in my lunatic world at least, maybe you could TELL THEM. Right there and then. Maybe, and again I could be the fool here, you should not drop obscure hints in a secret code to make them think they are in a Dan Brown novel as opposed to writing a comment saying why you changed it.
2. Even better, if the code itself has variables with meaningful names and what have you, then when someone uses the version comparison it is, hopefully, obvious at first glance why it changed. That only works if the code you removed had meaningful names as well, otherwise you do need a comment.
It is important to say why you changed something, especially if you are doing something odd to avoid a specific problem, otherwise some well-meaning person in the future will say “that’s odd” and change it right back again.
This very day I changed a line of code which said:-
M_FLAG = 'X'. "Ticket 530333
And I changed it to…
M_FLAG = 'X'. "M_FLAG means such and such, and we set it here because of such and such etc…
I once did some counter-intuitive logic in a complex database SELECT to avoid the wrong index being chosen, and years later, from the safety of another country, when I looked at my code in the system I used to work in I found a consultant had changed things back and written:-
* Above Code is very strange - cannot be understood without any background
* and functional specification. Therefore try end error to solve customer
* problem(cannot exclude any side effects):
He was right about the side effects. Since he was an advocate of leaving dead code in the program I can clearly see the problem I tried to avoid in the first place coming back, and then about fifteen different (to be precise, seventeen) attempts to solve it before deciding it was impossible.
Ian Dury and the Code Block Heads
In the book by “Uncle Bob” I was talking about earlier he makes the observation that programmers are authors, insomuch as after you write something at some stage in the future someone else (maybe even yourself) will come back and look at it, to make a change or just to understand what on Earth is going on.
He did some sort of screen capture of a programmer at work, and found that 90% of the time was spent paging up or down in the same code block trying to get an overview of what was happening. The logical conclusion was to have each routine no bigger than one page and – hey presto – you can program ten times faster! Hooray! Hang out the flags!
Do you know, amazing as it may seem, there are some people who when confronted with that argument express doubts. Who would have thought?
Anyway, there is no smoke without fire, and in fact one of the “ABAP programming guidelines” in the SAP Press book of the same name does advocate having every subroutine (‘method see section such and such’ as the book says every two lines throughout its entire length) no bigger than one page. Like all guidelines, that is most likely something to be aimed for rather than something that is actually possible as a matter of course.
Going back to 1981 when I first started programming you had to make the program as terse as you possibly could, as I only had 1K to work with. Clearly your first experience with something sticks with you in some sense for the rest of your life, as thirty years on I still cannot stand to have extra lines when they can be avoided.
For example I cannot stop myself changing:-
LOOP AT INTERNAL_TABLE.
  IF SOMETHING = TRUE.
    PERFORM GET_SOME_DATA USING INTERNAL_TABLE.
    IF SOMETHING_ELSE = TRUE.
      PERFORM GET_SOME_MORE_DATA USING INTERNAL_TABLE.
      IF SOMETHING_ELSE_YET_AGAIN = TRUE.
        PERFORM DO_SOMETHING.
      ENDIF.
    ENDIF.
  ENDIF.
ENDLOOP.
….into…
LOOP AT INTERNAL_TABLE WHERE SOMETHING = TRUE.
  PERFORM GET_SOME_DATA USING INTERNAL_TABLE.
  CHECK SOMETHING_ELSE = TRUE.
  PERFORM GET_SOME_MORE_DATA USING INTERNAL_TABLE.
  CHECK SOMETHING_ELSE_YET_AGAIN = TRUE.
  PERFORM DO_SOMETHING.
ENDLOOP.
The two routines are functionally identical (assuming SOMETHING is a field of the table row and nothing else follows the nested IF blocks inside the loop); it is just that the second one has fewer lines. So if having to page up and down less is important, and I actually tend to think it is, then you have enabled the casual viewer to see more of the routine (method, see section such and such, can you imagine how annoying that gets after the third time, let alone the hundredth?) on one screen.
In the programming guidelines book of which I speak one of the rules is to restrict control blocks like IF statements to a nesting level of five. Very sensible advice. I have seen, time and again, enormous IF statement blocks, nested so deeply that when you do a “pretty print” the part in the middle gets indented right to the far right of the screen.
In such situations I can’t stop myself encapsulating the sections between IF and ENDIF into their own routines. This can change things that are literally impossible to understand into the blindingly obvious.
Here is an example that makes me want to cry. Before I got to it there were not even any comments after each ENDIF:-
Legacy Figure 02
Here is another example where I have tinkered around a bit more:-
Legacy Figure 03
The above used to be one thousand lines of code. I got heartily sick of pressing page down loads of times to try and find the part that dealt with deletion.
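The general shape of that kind of tidy up, sketched with invented routine names and a made-up set of commands rather than my real program, is a short dispatching routine where each branch is one PERFORM:

```abap
REPORT zdispatch_demo.

PARAMETERS p_ucomm TYPE sy-ucomm DEFAULT 'DELETE'.

START-OF-SELECTION.
  PERFORM handle_user_command USING p_ucomm.

* The whole decision structure now fits on one screen
FORM handle_user_command USING id_ucomm TYPE sy-ucomm.
  CASE id_ucomm.
    WHEN 'CREATE'.
      PERFORM create_item.
    WHEN 'CHANGE'.
      PERFORM change_item.
    WHEN 'DELETE'.
      PERFORM delete_item.
    WHEN OTHERS.
      MESSAGE 'Unknown command' TYPE 'I'.
  ENDCASE.
ENDFORM.

* Each former block between WHEN and the next WHEN gets a home
FORM create_item.
  WRITE / 'Creation logic lives here'.
ENDFORM.

FORM change_item.
  WRITE / 'Change logic lives here'.
ENDFORM.

FORM delete_item.
  WRITE / 'Deletion logic lives here'.
ENDFORM.
```

Finding the part that deals with deletion is now a double click on DELETE_ITEM rather than an afternoon of paging down.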
That’s not my name, that’s not my name, that’s not my name…
One thing I have repeated in various blogs, over and again, is about having variable and routine names that don’t mean anything, often in the form of numbers e.g. routine GX09876. You may think no-one would call a routine that, so I’ll leave it to you – did I make that example up or is it a real one I actually saw?
Rather than repeat myself I’ll direct the interested to point 10 of my previous blog:-
This is one area where I have changed my behaviour. In the days of the ZX81 variables were called “X”, and in fact in a lot of the Java examples on OO programming many of the variable names there seem to follow the same convention.
I will just give two examples, one from the week just gone which was driving me up the wall.
The monolithic program I am always changing (slowly) is stuffed with global variables, and I wanted to find where some global data got set. It took me ages because there are about a hundred routines where data of different sorts gets set and they are evenly split between routines with the following prefixes:- GET_, SET_, INIT_, POPULATE_, DETERMINE_, so I have to search through five different areas of the subroutine list to try and find what I want.
I imagine some people would be horrified by using GET_ to actually set the value of something!
I also had to struggle for days working out how something worked. I got it down to two routines which were called:-
GET_DELETE_USER_SELECTION
CHECK_DELETE_USER_SELECTION
This was all about deleting various things which the user had chosen, so far so good. However the routine starting with “GET_” performs assorted checks on what the user has selected, and the routine starting with “CHECK_” actually does the deletions. Eventually I found another routine starting with “DETERMINE_” which actually asks the user what they want to delete.
I would love to hear from the ABAP community with examples of the worst named routines and variables they have encountered…..
Signature Tune
Legacy Figure 04
All OO people will tell you that global variables are the work of the devil and you should get rid of them at all costs. One of my colleagues tells me he had a job interview with two interviewers and they told him one of them was strongly for global variables, one was strongly against. How to handle that one?
Let’s try to be objective for a second, and have a look at the pros and cons of global variables.
Pros - at least what people would SAY are the advantages.
· It’s easy, you don’t have to keep passing values back and forth via routine parameters, you write your program a lot faster
· Selection-Screens and DYNPROS are designed to work with global variables
· You don’t end up with routines with gigantic signatures which could be deemed as confusing
· We’ve always done it this way (a very common argument)
Cons
· It makes unit testing a lot more difficult, you cannot guarantee the order unit tests are run in and so you have to make sure all global variables are set to the “right” values at the start of each test method
· Without the variables being changed actually being in the signature you can end up with badly named methods which change SOMETHING but you don’t know what. I’ll come to this in a minute.
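The first of those cons can be sketched as follows. The global GD_TOTAL and the routine ADD_TO_TOTAL are invented for the example; the SETUP method is the standard ABAP Unit hook that runs before every test method, which is where the globals have to be put back into a known state.

```abap
REPORT zglobal_test_demo.

DATA gd_total TYPE i.    "global variable changed by the routine

CLASS ltc_totals DEFINITION FOR TESTING
  RISK LEVEL HARMLESS DURATION SHORT.
  PRIVATE SECTION.
    METHODS setup.
    METHODS adds_to_empty_total FOR TESTING.
    METHODS adds_twice          FOR TESTING.
ENDCLASS.

* Routine under test: note the signature says nothing about GD_TOTAL
FORM add_to_total USING id_amount TYPE i.
  gd_total = gd_total + id_amount.
ENDFORM.

CLASS ltc_totals IMPLEMENTATION.
  METHOD setup.
*   Without this, each test inherits whatever the last one left
    CLEAR gd_total.
  ENDMETHOD.

  METHOD adds_to_empty_total.
    PERFORM add_to_total USING 5.
    cl_abap_unit_assert=>assert_equals( exp = 5
                                        act = gd_total ).
  ENDMETHOD.

  METHOD adds_twice.
    PERFORM add_to_total USING 5.
    PERFORM add_to_total USING 7.
    cl_abap_unit_assert=>assert_equals( exp = 12
                                        act = gd_total ).
  ENDMETHOD.
ENDCLASS.
```

With one global this is trivial; with a few hundred of them the SETUP method becomes a maintenance job all of its own, which is rather the point.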
I tend to agree with minimising global variables – the premise at the very start of this blog that having unit tests is the most important thing in the universe would point you towards avoiding things that make such tests impossible. But you can take things to extremes. The example I would give is SELECTION-SCREENS – do you really have to go to the trouble of moving all the global variables into data inside local classes and then pretend the global variables (for such are select-options) do not exist? Maybe you do. Who am I to say? I can see the merit of moving them into re-usable Z classes.
Anyway, in my monolithic example there are so many global variables, trying to get rid of them wholesale is not only difficult but incredibly risky for a program vital to the day to day running of the business, so I have to live with them. The biggest risk you have is getting rid of a global variable only to find it’s used on a screen, and the syntax check doesn’t always warn you about that.
So, I find myself looking at a bunch of routines, all of which change the values of some global variables. Changing the names of the routines so they describe what they do goes some way to defeating the enemy (incomprehension) but only so far. I thought the next step might be to change the signatures of the routines so when you look at them you see what is getting changed.
Before:-
Legacy Figure 05
In the above, the name of the routine with IDOC in it made sense when we used to send an IDOC, but we moved from sending an IDOC to using PI years ago, so the routine name now makes no sense at all. This is an example of comments and routine names “rotting” when all someone does is change the code to fix the current problem at hand and does not look at the surrounding code / comments.
After:-
Legacy Figure 06
Now we have a better idea of what the subroutine is expecting as input, and what data it returns. This also makes the individual subroutines more capable of being re-used, and paves the way for the eventual retirement of the global variables in question.
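Since the before and after pictures only show my real routines, here is a minimal sketch of the same idea that you can paste into a test program. All the names (Z_SIGNATURE_SKETCH, CALCULATE_TOTAL_QUANTITY, the item structure) are made up for illustration. The point is only that the caller still passes the global variables, but the routine now says in its signature what it reads and what it changes.

```abap
REPORT z_signature_sketch.

* Hypothetical data, invented purely for this sketch
TYPES: BEGIN OF ty_item,
         item_number TYPE n LENGTH 6,
         quantity    TYPE i,
       END OF ty_item,
       ty_items TYPE STANDARD TABLE OF ty_item WITH DEFAULT KEY.

* The global variables are still here, we are living with them for now
DATA: gt_items          TYPE ty_items,
      gs_item           TYPE ty_item,
      gv_total_quantity TYPE i.

START-OF-SELECTION.
  gs_item-item_number = '000010'.
  gs_item-quantity    = 5.
  APPEND gs_item TO gt_items.

  gs_item-item_number = '000020'.
  gs_item-quantity    = 7.
  APPEND gs_item TO gt_items.

* The call now shows what goes in and what gets changed
  PERFORM calculate_total_quantity USING    gt_items
                                   CHANGING gv_total_quantity.

  WRITE: / 'Total quantity:', gv_total_quantity.

* Inside the routine only the parameters are touched, never the globals
FORM calculate_total_quantity USING    it_items          TYPE ty_items
                              CHANGING cv_total_quantity TYPE i.
  DATA ls_item TYPE ty_item.

  CLEAR cv_total_quantity.
  LOOP AT it_items INTO ls_item.
    cv_total_quantity = cv_total_quantity + ls_item-quantity.
  ENDLOOP.
ENDFORM.
```

Once every caller goes through the signature like this, replacing a global variable with a local one becomes a change in the caller only, which is a lot less scary.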
On the earlier matter of routines with huge signatures, an OO person would say if your method does have an enormous signature it is most likely doing too much.
Jumping round the program like a lunatic
Recently I have been translating some “pseudo-code” written by an older gentleman who must have been familiar with BASIC, as I see GOTO statements here and there. I used to write in that myself and I can see the argument that GOTO 210 is not a very meaningful statement.
If we are trying to write a program which is easy to understand, then it could be said that understanding the flow is important. Maybe here is where your bog standard procedural program without screens shines: it is usually crystal clear in what order each procedure is called. A programmer could go bananas if they wanted and jump around illogically within procedures, but the ones I have seen are usually fairly easy to follow.
Once you bring DYNPRO screens into the equation all bets are off. On a screen, after user input, the system processes the AT EXIT-COMMAND modules and the user command module in the PAI and, to make things fun, the AT USER-COMMAND event can also be triggered, so in some programs I have seen, half of the user command processing was handled by a module in the PAI section and half in the event block responding to AT USER-COMMAND.
In my example program there are 32 different screens, all of which are pop up boxes firing from user commands from the main ALV screen. A lot of older programs like this use the SUPPRESS DIALOG method for dialog boxes and then WRITE out a list of some sort on the screen, thus rendering the PAI modules useless, as they get triggered at once. The more modern way of doing things is to have an ALV grid on the screen instead, and then of course things become even more obscure – or do they?
Many years ago when I went on an ABAP OO course here in Australia the instructor said “the problem, of course, with OO code, is that you can’t tell what it does from reading it”. This was an official SAP training course I would add.
I hope that’s not true, because my whole argument so far is that if 99% of development effort is maintenance then it is VITAL that you can tell what a program does by looking at the code.
Well, ladies and gentlemen of the jury – is this true? Is the screen flow of a DYNPRO screen easier to follow than firing off loosely coupled events that could be handled anywhere? I don’t know the answer but I do know that debugging what happens after pressing a command in VA01 is easier to follow than the equivalent in ME21N. Or is that just because I am an OO novice? One moan I get repeated to me all the time by die hard procedural types is that when looking at OO code the forward navigation you get in ABAP when clicking on a procedure call is just not there with the largely dynamic calls you find in OO world.
Istanbul not Constant
One thing I mention again and again in my blogs is the so-called “magic numbers” that pop up in programs. If we are talking about how to make things less confusing for the reader and at the same time easier to change as business requirements change, then I am going to have to do the broken record thing and bring this up again (I wonder if anyone below a certain age has the slightest idea what the term “broken record” originally referred to?).
As an example the other day in a specification I was given which had TONS of business logic within it, at one point I was supposed to multiply the result by one point six four five. Easy enough to do, but I thought to myself what is so wonderful about that number and more to the point, can it change?
A quick hunt all around the internet came back with the answer that one point six four five was the “famous” such and such’s constant, which he had come up with via an exhaustive series of scientific experiments proving that the result is wrong unless you multiply it by this number. All well and good, so if it is a scientific constant like PI it can’t change, can it? A-ha! Every country has a different interpretation of this result, so this “constant” has a different value in calculations done in India than in Australia, for example.
If it had been a universal constant then I would have declared it as a constant with the correct scientific name, and probably given a little explanation of what this was all about as a comment. In this case it varies from country to country so the value needs to be read out of some sort of customising table, but it still needs a meaningful name for the variable.
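Here is a rough sketch of the “read it from customising and give it a meaningful name” approach. Everything in it is an assumption for illustration: the internal table stands in for a real customising table (in real life you would SELECT from your own Z table), the country keys XX and YY are placeholders, and the second factor is a made-up value. Only 1.645 comes from the specification I mentioned.

```abap
REPORT z_adjustment_factor_sketch.

TYPES: BEGIN OF ty_adjustment_factor,
         country TYPE c LENGTH 3,
         factor  TYPE p LENGTH 5 DECIMALS 3,
       END OF ty_adjustment_factor.

* Stand-in for a customising table, keyed by country
DATA: gt_adjustment_factors TYPE HASHED TABLE OF ty_adjustment_factor
                            WITH UNIQUE KEY country,
      gs_adjustment_factor  TYPE ty_adjustment_factor,
      gv_raw_result         TYPE p LENGTH 13 DECIMALS 3 VALUE '100.000',
      gv_adjusted_result    TYPE p LENGTH 13 DECIMALS 3.

PARAMETERS p_cntry TYPE c LENGTH 3 DEFAULT 'XX'.

START-OF-SELECTION.
* Placeholder entries: XX uses the value from the spec, YY is invented
  gs_adjustment_factor-country = 'XX'.
  gs_adjustment_factor-factor  = '1.645'.
  INSERT gs_adjustment_factor INTO TABLE gt_adjustment_factors.

  gs_adjustment_factor-country = 'YY'.
  gs_adjustment_factor-factor  = '1.500'.
  INSERT gs_adjustment_factor INTO TABLE gt_adjustment_factors.

* The number never appears in the calculation, only its name does
  READ TABLE gt_adjustment_factors INTO gs_adjustment_factor
       WITH TABLE KEY country = p_cntry.

  IF sy-subrc = 0.
    gv_adjusted_result = gv_raw_result * gs_adjustment_factor-factor.
    WRITE: / 'Adjusted result:', gv_adjusted_result.
  ELSE.
    WRITE: / 'No adjustment factor maintained for country', p_cntry.
  ENDIF.
```

In a real program the variable would carry the proper scientific name of the factor rather than my neutral “adjustment factor”, and a comment would say where the number comes from.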
Immutables in Mega City One
A top tip that Horst Keller from SAP gives in his book is that when you do have a proper immutable constant, don’t declare it as a local constant but as a global one. That way if you have fifty four instances of the program running in different user sessions, the ABAP runtime system uses black magic to only have the value of the constant existing in one place in shared memory and all the user sessions read the value from there. The amount of memory this saves is minuscule, but as I mentioned much earlier I love making microscopic savings like this.
You could even have a global class just full of (related) constants – I see a lot of standard SAP classes like that. This is for when such a constant needs to be read by more than one program.
Constantly Amazed
I could not leave any discussion of constants without mentioning a habit some consultants used to have: if they were going to add ten to a value (say to determine the next line item of a sales order) they would declare a global constant called C_010. What value does that add?
Then of course, five years down the track the value we want to add changes to five and then they think “good job this is a constant, I only need to make the change in one place” and thus you end up with a constant called C_010 which has a value of five. I don’t think I need to expand.
The other thing which made me bang my head on the desk was that when confronted with an obscure code that SAP uses to denote a piece of equipment flagged for deletion – something like ISRU – then instead of declaring a constant called C_FLAGGED_FOR_DELETION they decided to call the constant C_ISRU. Well, thanks for that, everything is much clearer now. Of course, one day SAP might decide to change the code to JKML, and then you have one constant with a meaningless name pointing to a different meaningless value.
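To pull the last two sections together, here is a small sketch of constants named after what they mean rather than what they contain, kept together in one class. The local class is only a stand-in for a global class you would create in SE24 so that several programs can share it, and all the names are invented. The code ISRU is just the “something like” example from above, not a real SAP status.

```abap
REPORT z_constant_names_sketch.

* Stand-in for a global class full of related constants.
* No implementation part is needed as the class has no methods.
CLASS lcl_named_constants DEFINITION.
  PUBLIC SECTION.
    CONSTANTS: item_number_increment TYPE i VALUE 10,
               flagged_for_deletion  TYPE c LENGTH 4 VALUE 'ISRU'.
ENDCLASS.

DATA: gv_last_item_number TYPE i VALUE 20,
      gv_next_item_number TYPE i,
      gv_equipment_status TYPE c LENGTH 4 VALUE 'ISRU'.

START-OF-SELECTION.
* Reads as "add the increment", whatever the increment is this year
  gv_next_item_number = gv_last_item_number
                      + lcl_named_constants=>item_number_increment.
  WRITE: / 'Next item number:', gv_next_item_number.

* Reads as a business question, not as a comparison with a cryptic code
  IF gv_equipment_status = lcl_named_constants=>flagged_for_deletion.
    WRITE: / 'Equipment is flagged for deletion'.
  ENDIF.
```

If the increment changes to five, or the code changes to JKML, the constant names stay true and only the VALUE additions move.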
Enigma Code
So, by doing assorted cosmetic changes and minor restructuring over a protracted period of time, hopefully you end up with something where you can at least understand how it works, even if everything is still 100% procedural.
Aiming lower, maybe now you have PART of a big program where it is clear how it works. If so, we are now half way to where all the other articles on this subject start.
What can you do about all the dependencies??
This state of DEPENDENCY will be
This next part has been written about at great length, including by myself, so I am just going to cover the basic idea, and give links to some proper articles about it.
In essence you want to do a test for your productive code but you can’t, because the productive code reads from the database or writes to the database or accesses some external system or sends an email or needs some input from a user or one of a million other things you cannot or do not want to do in the development environment.
Traditionally in an ABAP program if you want some information from the database you do a SELECT statement right there and then at the point you need it. Likewise updating a custom table, or maybe calling a PI proxy. Why in the world not? It’s so easy!
Well the reason why not is that then you can’t test the routine in question which does any of those sorts of things.
I am going to make up an example, based on real life, but changed a touch.
In this example my monolithic SAP program schedules feeding time at the zoo and makes sure all the animals get the right amount and type of food. Everything works fine. The keepers have some sort of mobile device which they can query to see what animals need feeding next and there is some sort of feedback they have to give after feeding the animals, and we want to keep track of the food inventory and the list goes on, none of the details are important.
All is well until one day we get two Pandas on loan from China, and the programmer has to make some changes to accommodate their specific Panda type needs (e.g. Bamboo Shoots) without stuffing up all the other animals. The programmer can do a unit test on the Panda specific changes they have made, but how can they be sure that an unforeseen side effect will not starve the Lions, causing them to break out and eat all the birds, all the monkeys and then break into the insect house and crush and eat the beehive?
Normally you can’t. There are just too many dependencies, the program needs to read the system time, the configuration details on what gets fed when, needs to read and write inventory levels, sends messages to the zookeepers, receives and processes input from them, interfaces with external systems etc.
However we really don’t want to risk the existing system breaking and the Lions having to dine on Finch, Chimps and Mushy Bees, so how can we enable tests?
The answer is to use the “separation of concerns” approach, where we have one class for database access, one for the user interface layer, one for talking to an external system etc. which initially I thought was good practice anyway as it enabled you to change the implementation of, say, your user interface layer without affecting anything else, but this also turns out to be vital for unit tests.
Also I have seen a whacking great example where someone split out all the database access into its own class, presumably following the separation of concerns model, but he made every single variable and method static, which proves he didn’t really understand the OO concept at all. You can’t redefine static methods in a subclass, and because of that you can’t replace the methods of such a class with fake versions in your unit tests.
After you have isolated each dependency into its own class, you can use a technique called constructor injection (one form of dependency injection): when creating an instance of the class under test, you pass in the objects it depends on, and in a test you pass in fake versions instead. Then you can do unit tests on the productive code which will run in the development environment and not really try to read and write to the database, send emails etc., but which will test the business logic nonetheless.
For an example program where I use this technique to create a mock class for an external system please see my blog:-
http://scn.sap.com/community/abap/blog/2013/01/08/domain-specific-language-in-abap
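For those who do not want to follow the link, here is a minimal sketch of constructor injection using the zoo. Every name in it (the interface, the classes, the food types, the quantities) is invented for this sketch and is not taken from any real program. The productive database class is left as an empty shell on purpose, because the point is that the test never goes near it.

```abap
REPORT z_constructor_injection_sketch.

* The dependency is described by an interface, so it can be swapped
INTERFACE lif_food_stock_reader.
  METHODS get_stock_in_kilos
    IMPORTING iv_food_type    TYPE string
    RETURNING VALUE(rv_kilos) TYPE i.
ENDINTERFACE.

* Productive version: this is where the real SELECT would live
CLASS lcl_food_stock_db DEFINITION.
  PUBLIC SECTION.
    INTERFACES lif_food_stock_reader.
ENDCLASS.

CLASS lcl_food_stock_db IMPLEMENTATION.
  METHOD lif_food_stock_reader~get_stock_in_kilos.
    rv_kilos = 0. "Database read deliberately left out of the sketch
  ENDMETHOD.
ENDCLASS.

* Business logic: knows the interface, not where the data comes from
CLASS lcl_feeding_planner DEFINITION.
  PUBLIC SECTION.
    METHODS constructor
      IMPORTING io_stock_reader TYPE REF TO lif_food_stock_reader.
    METHODS can_feed
      IMPORTING iv_food_type       TYPE string
                iv_kilos_needed    TYPE i
      RETURNING VALUE(rv_can_feed) TYPE abap_bool.
  PRIVATE SECTION.
    DATA mo_stock_reader TYPE REF TO lif_food_stock_reader.
ENDCLASS.

CLASS lcl_feeding_planner IMPLEMENTATION.
  METHOD constructor.
    mo_stock_reader = io_stock_reader. "The injection itself
  ENDMETHOD.
  METHOD can_feed.
    IF mo_stock_reader->get_stock_in_kilos( iv_food_type ) >= iv_kilos_needed.
      rv_can_feed = abap_true.
    ELSE.
      rv_can_feed = abap_false.
    ENDIF.
  ENDMETHOD.
ENDCLASS.

* Fake version for tests: fixed answers, no database
CLASS ltd_fake_stock DEFINITION FOR TESTING.
  PUBLIC SECTION.
    INTERFACES lif_food_stock_reader.
ENDCLASS.

CLASS ltd_fake_stock IMPLEMENTATION.
  METHOD lif_food_stock_reader~get_stock_in_kilos.
    CASE iv_food_type.
      WHEN `BAMBOO`.
        rv_kilos = 5.
      WHEN OTHERS.
        rv_kilos = 100.
    ENDCASE.
  ENDMETHOD.
ENDCLASS.

CLASS ltc_feeding_planner DEFINITION FOR TESTING
  RISK LEVEL HARMLESS DURATION SHORT.
  PRIVATE SECTION.
    DATA mo_cut TYPE REF TO lcl_feeding_planner.
    METHODS setup.
    METHODS lions_still_get_fed FOR TESTING.
    METHODS pandas_short_of_bamboo FOR TESTING.
ENDCLASS.

CLASS ltc_feeding_planner IMPLEMENTATION.
  METHOD setup.
    DATA lo_fake_stock TYPE REF TO ltd_fake_stock.
    CREATE OBJECT lo_fake_stock.
    CREATE OBJECT mo_cut
      EXPORTING
        io_stock_reader = lo_fake_stock.
  ENDMETHOD.
  METHOD lions_still_get_fed.
    cl_abap_unit_assert=>assert_equals(
      exp = abap_true
      act = mo_cut->can_feed( iv_food_type    = `MEAT`
                              iv_kilos_needed = 50 ) ).
  ENDMETHOD.
  METHOD pandas_short_of_bamboo.
    cl_abap_unit_assert=>assert_equals(
      exp = abap_false
      act = mo_cut->can_feed( iv_food_type    = `BAMBOO`
                              iv_kilos_needed = 20 ) ).
  ENDMETHOD.
ENDCLASS.
```

The productive program would create an LCL_FOOD_STOCK_DB instance and pass that to the constructor instead. Note that this only works because the dependency is an instance behind an interface. Had the stock reader been a class full of static methods, there would have been nothing to swap.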
When I started looking at this a few years back I thought “what a waste of time, 95% of my programs involve reading things from the database and updating other database tables, if you fake all of this using stubs what in the world is the benefit?”
It turns out that the remaining 5% of code is riddled with bugs and this shows up instantly during unit testing. If you don’t believe me, try it yourself and you will most likely discover the same thing on the very first attempt. Many times I have thought “this routine has only three lines of code – IT CAN’T BE WRONG” – yet it was. In the abap2xlsx project unit tests have started to be introduced, and there was a test on a three line method that failed and it took me ages to work out why.
Epilogue – Procedural Tests
Simon Says Put Your Hands On Your Head
I like to play a game, that is so much fun, and it’s not so very hard to do. The name of the game is “unit test for procedural programs” and I’d like for you to play it too. Tra la lala!
You may say to yourself “this procedural program works perfectly, it is business critical and reworking it even slightly is far too dangerous”. You may well be right, but this does not rule out using the ABAP unit test framework. As we have said, if having tests is the most vital thing, then if your procedural code has unit tests and the OO programmer turning his nose up at you does not, then the OO programmer is the one with the legacy code!
Often I have been in that very boat – let’s take a Z function module for example – in SE37 you can take the option “Goto -> Local Test Classes” and add the test class definition and implementation. In this situation, as the classes are local you can do PERFORM calls from within your test methods.
Now we find that having everything as a global variable actually works for us. At the start of each test method you need to do a bit of setup to make sure the global variables are in the state you want.
In the test driven development methodology you write the test first. This doesn’t make any sense in this context as we are dealing with an established program, but in my example I start with the actual problem we have found in production, which is what is causing me to change my monolithic program in the first place. I keep changing examples; in this one the application checks to see if a supplier invoice will process correctly.
Legacy Figure 07
As always, to quote Frank N Furter, do not get too hung up on the way my examples look, i.e. what the application is trying to do, it’s the principle that matters.
In this case the GIVEN section sets up the input to the function module, the WHEN section calls the function module itself, and the THEN section evaluates the result. If ABAP let you have method names longer than 30 characters like Java and the like seem to allow, then you could have the whole thing in plain English with no need for comments at all.
There are two possibilities here:-
- Within your function module you have a clear separation between data retrieval (from the database or wherever) and processing of the data you just retrieved. In that case you are laughing. In the GIVEN section you set up those global tables with the fake values you want for the test, and in the WHEN section you call the PERFORMS from your function which deal with processing those global tables.
- If everything in the function is hopelessly intermingled, which is usually the case, then you would have to make some changes i.e. move the database retrieval into its own class as described above. If you want tests there is no way around this.
Legacy Figure 08
This is not the end of the world, it seems hopeless at the start, but if you break things up one little bit at a time you can get there in the end, though your colleagues may be really puzzled why you move all the database reads into one place.
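For the happy case, where processing is already separate from data retrieval, here is a minimal sketch of a procedural unit test. I have written it as a stand-alone report rather than a function group so it can be pasted in and run as it is, but the local test class works the same way in either. The routine, the global variables and the invoice amounts are all invented for the sketch.

```abap
REPORT z_procedural_test_sketch.

TYPES: BEGIN OF ty_invoice_item,
         amount TYPE p LENGTH 13 DECIMALS 2,
       END OF ty_invoice_item.

* Global variables, just like in the legacy program
DATA: gt_invoice_items TYPE STANDARD TABLE OF ty_invoice_item
                       WITH DEFAULT KEY,
      gv_header_amount TYPE p LENGTH 13 DECIMALS 2,
      gv_invoice_ok    TYPE abap_bool.

* Legacy style routine: no signature, works purely on the globals
FORM check_invoice_balances.
  DATA: ls_item  TYPE ty_invoice_item,
        lv_total TYPE p LENGTH 13 DECIMALS 2.

  LOOP AT gt_invoice_items INTO ls_item.
    lv_total = lv_total + ls_item-amount.
  ENDLOOP.

  IF lv_total = gv_header_amount.
    gv_invoice_ok = abap_true.
  ELSE.
    gv_invoice_ok = abap_false.
  ENDIF.
ENDFORM.

CLASS ltc_invoice_check DEFINITION FOR TESTING
  RISK LEVEL HARMLESS DURATION SHORT.
  PRIVATE SECTION.
    METHODS setup.
    METHODS balanced_invoice_is_ok FOR TESTING.
    METHODS unbalanced_invoice_fails FOR TESTING.
ENDCLASS.

CLASS ltc_invoice_check IMPLEMENTATION.
  METHOD setup.
*   Runs before every test method, so test order does not matter
    CLEAR: gt_invoice_items, gv_header_amount, gv_invoice_ok.
  ENDMETHOD.

  METHOD balanced_invoice_is_ok.
    DATA ls_item TYPE ty_invoice_item.
*   GIVEN a header amount that matches the sum of the items
    gv_header_amount = '150.00'.
    ls_item-amount = '100.00'.
    APPEND ls_item TO gt_invoice_items.
    ls_item-amount = '50.00'.
    APPEND ls_item TO gt_invoice_items.
*   WHEN the check runs
    PERFORM check_invoice_balances.
*   THEN the invoice is accepted
    cl_abap_unit_assert=>assert_equals(
      exp = abap_true
      act = gv_invoice_ok
      msg = 'Balanced invoice should pass the check' ).
  ENDMETHOD.

  METHOD unbalanced_invoice_fails.
    DATA ls_item TYPE ty_invoice_item.
*   GIVEN a header amount that does not match the single item
    gv_header_amount = '150.00'.
    ls_item-amount = '100.00'.
    APPEND ls_item TO gt_invoice_items.
*   WHEN the check runs
    PERFORM check_invoice_balances.
*   THEN the invoice is rejected
    cl_abap_unit_assert=>assert_equals(
      exp = abap_false
      act = gv_invoice_ok
      msg = 'Unbalanced invoice should fail the check' ).
  ENDMETHOD.
ENDCLASS.
```

The SETUP method is where the global variables get put back into a known state before each test, which is the price you pay for keeping them.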
Epilogue – OO Tests
Have I said too much? I haven’t said enough
If you get to the stage where you have broken all your dependencies by isolating various things into their own classes and you are in OO world to an extent, then there is nothing more for me to say as there is a flood of articles on the internet about how to proceed from that point. Here are some links to various articles on OO unit testing that I have found useful.
I have mentioned lots of these on my other blogs, but I cannot point people at them often enough:-
http://www.objectmentor.com/resources/articles/WorkingEffectivelyWithLegacyCode.pdf
http://dannorth.net/introducing-bdd/
http://scn.sap.com/community/abap/blog/2013/01/26/getting-the-brownfield-clean-but-not-green--part-i
As a last word, I may seem to say things as if I know it all, but that is far from the truth; that’s just me getting carried away. I am sure I have said a load of nonsense many times in the above blog, as well as jumping from subject to subject at random.
When I have made a mistake, or if you disagree in the slightest with anything I have said, please post a response. You will be doing me a favour, though to be honest I didn’t think the following comment I got was very positive – it was - “have you ever even USED SAP????”
Oh dear! Well anyway, I have in fact been using SAP every day since 1997, and the way I use it has changed dramatically in that period, but if I am still using it incorrectly, well I need to be told…
Cheersy Cheers from Down Under
Paul