Friday, June 18, 2004

Return-codes vs. Exceptions, part 516



Let's do a little recap and summarize what I've covered so far regarding the never-ending debate between return-codes and exceptions.

Late last year, Joel set off a veritable firestorm of controversy by stating that, in short, exceptions were as hazardous to a developer as 'goto' statements. His reasoning:


- They are invisible in the source code. Looking at a block of code, including functions which may or may not throw exceptions, there is no way to see which exceptions might be thrown and from where. This means that even careful code inspection doesn't reveal potential bugs.

- They create too many possible exit points for a function. To write correct code, you really have to think about every possible code path through your function. Every time you call a function that can raise an exception and don't catch it on the spot, you create opportunities for surprise bugs caused by functions that terminated abruptly, leaving data in an inconsistent state, or other code paths that you didn't think about.
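Joel's second point is easier to see with something concrete. What follows is a minimal C++ sketch, not code from any of the posts mentioned here: the `Ledger` type and the `recordDebit`, `transfer` and `isBalanced` functions are all made up for illustration. Nothing in the body of `transfer` hints that the second call can exit the function early, yet when it throws, the credit has already been taken out and the debit never gets recorded.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical ledger, used only for illustration.
struct Ledger {
    int credits = 100;
    int debits = 0;
};

// Hypothetical callee: throws when the amount exceeds an arbitrary limit.
void recordDebit(Ledger& ledger, int amount) {
    if (amount > 50) {
        throw std::runtime_error("debit limit exceeded");
    }
    ledger.debits += amount;
}

// Two steps that are meant to happen together. The call to recordDebit()
// is a hidden exit point: nothing at the call site says it can throw.
void transfer(Ledger& ledger, int amount) {
    assert(amount >= 0);
    ledger.credits -= amount;     // step 1 always completes
    recordDebit(ledger, amount);  // step 2 may throw and skip the rest
}

// True while credits + debits still add up to the starting total of 100.
bool isBalanced(const Ledger& ledger) {
    return ledger.credits + ledger.debits == 100;
}
```

Calling `transfer(ledger, 80)` on a fresh `Ledger` throws and leaves `credits` at 20 with `debits` still at 0, so `isBalanced` returns false. Whoever catches the exception further up the stack inherits that half-finished state.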


The Java and C# communities erupted with dissent from Brad, Jesse, Sergio and Ned, among others.

Over the past few weeks, I've spent quite a bit of time describing why I believe return-codes should be used for return-code handling (rather than brute-forcing exceptions into a return-code model). In short:

1) Exceptions tacitly encourage abdication of cleanup responsibility. Nearly every example I've seen highlights the "readability" of exceptions and they often use strained, hardened examples of try, catch and finally blocks that work well when addressing the return-code issue. But the problem is, very few people are disciplined enough to set try, catch and finally clauses for every single method. Look at some real, production code. Odds are high that you'll find few methods that are covered for try, catch and finally. And, everyone will agree (I'm sure) that in those cases, exceptions are no panacea for unwinding state. On my development teams (C++/C), I've counseled developers that they should do as Joel does. Don't throw any exceptions and catch everything. All methods and calls return codes. Logging and instrumentation must be performed on any high-level routines. Example below.

2) Exceptions are significantly more difficult to maintain. Review some source code that uses exceptions. Real source code. Not the strained, artificial-looking examples of perfect exception handling. Look at the method calls. And tell me which exceptions are going to get thrown by which methods (without relying on a cheat sheet) so that you can cleanly determine where you'll set state variables to recover from partial completion. Return-codes encourage the use of state variables for unwinding state. It's crystal clear in our architecture that all method calls should operate the same way. They return error-codes. And every single one needs to be checked.

3) Exceptions complicate tunable logging and instrumentation, a valuable -- but often omitted -- capability of reliable systems. I haven't seen an exception-based model that tackles my logging and instrumentation example, below. I'm still waiting. For a more detailed explanation of tunable logging, see this previous blog entry.

4) Exceptions add a needless abstraction layer over the return-code based systems on which we rely. RPC/LPC systems (e.g., SOAP). Operating system calls. Device-driver interaction. Guess what, when we retrieve a SOAP document, we get a status code returned within it that tells us whether a failure occurred and, if so, the type of error. Thus, the baseline for interprocess communication is... return-codes. The baseline for operating system communication is... return-codes. And the baseline for device-driver software is... return-codes. Do we need an exception model pasted on top of the native return-code model? Actually, do we need lots of arbitrary exception models (given the fact that C++, C#, Java and other languages diverge significantly on actual exception implementation - types of exceptions handled in various subsystems and assemblies, checked vs. unchecked, etc.)? Do we need a bunch of incompatible exception models floating around?

5) Exceptions are, in the vast majority of eligible systems, not used (and even banned, in some cases) for mission-critical software: system software, drivers, real-time embedded systems, etc. Ever wonder why? Mostly, because of the reasons outlined above. Highly vetted code should be easily maintainable. Consistent in handling errors. Deterministic in terms of flow. Suitable for logging and instrumentation. And must strip away abstraction layers that can stand in the way of responsiveness to the real problems. Despite a minority of my colleagues who are, I'm certain, quite diligent about exceptions, most developers aren't consistent, diligent or even paranoid about their code. The majority probably considers exceptions a helpful abstraction intended to "simplify" error handling. The problem is, for these less-disciplined folks, exceptions end up luring developers into the same old trap. The siren-song of, "Don't worry about handling that, you're using exceptions... someone else will take care of it... don't worry...". Exceptions are for... exceptional conditions. Not routine error-handling.
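Point 4's claim that the platform baseline is return-codes can be illustrated with the C runtime. This is a small sketch under stated assumptions: `probeFile` is a hypothetical helper, not a real API. `std::fopen` reports failure directly by returning a null pointer, and the reason is then retrieved from `errno`. POSIX specifies that `fopen` sets `errno` on failure; ISO C does not strictly require it, so the sketch falls back to -1 if `errno` was left untouched.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>

// Hypothetical helper: returns 0 if the file can be opened for reading,
// otherwise the errno value reported by the runtime (or -1 if none was set).
int probeFile(const char* path) {
    assert(path != nullptr);
    errno = 0;
    std::FILE* file = std::fopen(path, "r");
    if (file == nullptr) {
        // Direct result (null pointer) plus a retrieved code (errno).
        return errno != 0 ? errno : -1;
    }
    std::fclose(file);
    return 0;
}
```

A caller checks the result the same way it would check any other return-code, for example `if (probeFile(path) != 0) { /* log and unwind */ }`.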

Brad is good enough to list a variety of reasons why he believes exceptions are preferred. We'll recap each one.

"Exceptions Promote API Consistency... the Win32 API is a clear example of this inconsistency through BOOLs, HRESULTS, and GetLastError()". So, given that my operating system, RPC system, device drivers -- on virtually every platform -- generate return codes through either a direct result or a retrieval, how exactly do exceptions provide consistency? No platform supports exceptions at the API level. They are an abstraction tacked on to a return-code model (see #4, above).

"Exceptions are Compatible with Object Oriented Features... yet the developer writing the method has no choice in the return value (if any) of the method". The same problem exists with exceptions. If you plug into an OS feature that's unsupported by your OO language (e.g., a Win32 call you need to make), then you're back in the return-code world. If it's my team writing the code, all of the code we generate will be consistent. Each method will return codes. And they will be checked.

"With Exceptions, Error Handling Code Need not be Near Failing Code..." That's not a good thing. I've illustrated this before and it is illustrated in my example, below. I want error-handling code processing events and unwinding state every step of the way. As if my life depended on it. Because it might.

"...With return-value based error reporting error handling code is always very near to the code that could fail. However, with exception handling the application developer has a choice. Code can be written to catch exceptions near the failure point, or the handling code can be further up in the call stack..." Ahhh, we get to my point #1: Exceptions tacitly encourage abdication of cleanup responsibility. Someone else will NOT take care of it, so don't abdicate the responsibility for the exception!

"With Exceptions, Error Handling Code is More Localized..." Uhhh, yeah. We have to check the return-codes. If you disassemble your exception-handling code, you'll find that, yes, it uses compare op-codes (if statements) to handle the processing of the exceptions. It's just been abstracted out for you, which point #4 addresses.

"Exceptions Are Not Easily Ignored..." Sure they are. Joel's point: "They are invisible in the source code" is highlighted by the frequent servlet stack dumps I receive while using some popular web sites. We've all seen them. Exceptions are easily ignored in a team environment. "Oh, I thought that method was/wasn't going to throw that type of exception". Remember Brad's comment, "...with Exceptions, Error Handling Code Need not be Near Failing Code...". It's called buck-passing. And it's as easy to ignore a critical exception as it is to fail to check a return-code. It's the unwinding logic that's needed, no matter which type of system is used.

"Exceptions Allow for Unhandled Exception Handlers... Today, Microsoft Office uses an unhandled exception handler to gracefully recover and re-launch the application as well as send error information to Microsoft to improve the product." True, if a Microsoft Office product crashes, it can send an error-report and re-launch itself. But my goal is to have zero crashes. A catastrophic exception handler, though, is useful and is not germane to this discussion. This discussion is about handling normal error conditions, not blue-screen-worthy explosions.

"Exceptions Provide Robust Error Info for Logging..." I'm still waiting for someone to tackle the exception version of my tunable logging and instrumentation method, below.

"Handling Errors Consistently and Thoroughly are Essential for Creating a Good User Experience..." True. But this has nothing to do with either exceptions or return-codes. It's about consistently tracking state... and unwinding state when necessary.

"Exceptions Promote Instrumentation..." Again, I'm still waiting for someone to tackle the exception version of my tunable logging and instrumentation method, below.

"Exceptions Unify the Error Model (Hard and Logical Errors)..." Sorry, there are too many types of exceptions... handled differently in too many languages... for this to truly be the case. Maybe someday. And my point #4 stands in stark constrast to this... the fact is, the underlying platforms we work with are all based on return-codes.

And one of my favorite commentators, Raymond Chen, pointed out some flaws in Brad's article...


"you don't have to worry about things ending up in partial states"

You still have to worry about partial states - the "catch" and "finally" clauses need to clean up any partial state that may have been caused by an exception thrown while the data structures were unstable.

And how many people wrap their entire function inside a try/finally to clean up partial states?


Exactly. My point #1. For the vast majority, exceptions actively encourage abdication of responsibility for the error.

Jesse's exception example is interesting:



void DrawFloodFilledBitmap()
{
    Bitmap b = null;
    try
    {
        b = new Bitmap(width, height);
        // do something
        b.FloodFill(x, y, color, fillType); // yah, I know this isn't a real method
    }
    catch (Exception ex)
    {
        // handle errors
        GenericException ge = new GenericException("Could not draw bitmap");
        ge.InnerException = ex;
        throw ge;
    }
    finally
    {
        if (b != null) b.Dispose();
    }
}


Hmmm... we get a generic exception that indicates the bitmap could not be drawn. No reason indicated. Hmmm... not quite as helpful as a return-code, I'd posit.

The JOS forum had a good discussion going around this topic... the majority siding with exceptions. I think I know why. :-)

Anyhow, here's my promised example. How can I create an exception-based version of this that supports tunable logging and/or instrumentation?
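One note before the code: it calls `TunableLog`, which is not defined in this post. A minimal sketch of what such a function might look like follows. The signature is inferred from the calls in the example (a message plus an optional numeric level); the threshold variable, `SetLogThreshold`, the default level of 1 and the `bool` return value are assumptions made for illustration, not the actual implementation.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical verbosity knob: messages with a level above this are dropped.
static int g_logThreshold = 1;

// Hypothetical setter; a real system might read this from config at runtime.
void SetLogThreshold(int threshold) {
    assert(threshold >= 0);
    g_logThreshold = threshold;
}

// Writes the message only when its level is within the current threshold.
// Returns true if the message was written (an assumption, handy for testing).
bool TunableLog(const std::string& message, int level = 1) {
    if (level > g_logThreshold) {
        return false;
    }
    std::clog << "[level " << level << "] " << message << '\n';
    return true;
}
```

With the threshold at 1, failure messages (default level 1) are written and the level-5 "succeeded" chatter is dropped. Raising the threshold to 5 turns the chatter on without touching the calling code.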




// rc and the bUnwind* flags are declared earlier; the flags must start out FALSE.
do { try {

    if ((rc = tableCredits.open()) != OK) {
        TunableLog("Credits open failed...");
        break;
    }
    bUnwindTableCreditsOpen = TRUE;
    TunableLog("Credits open succeeded...", 5);

    if ((rc = tableCredits.lock()) != OK) {
        TunableLog("Credits lock failed...");
        break;
    }
    bUnwindTableCreditsLock = TRUE;
    TunableLog("Credits lock succeeded...", 5);

    if ((rc = tableDebits.open()) != OK) {
        TunableLog("Debits open failed...");
        break;
    }
    bUnwindTableDebitsOpen = TRUE;
    TunableLog("Debits open succeeded...", 5);

    if ((rc = tableDebits.lock()) != OK) {
        TunableLog("Debits lock failed...");
        break;
    }
    bUnwindTableDebitsLock = TRUE;
    TunableLog("Debits lock succeeded...", 5);

    if ((rc = ::IntegralTransaction(tableCredits, tableDebits, curAmount)) != 0) {
        TunableLog("Integral transaction failed...");
        break;
    }
    TunableLog("Integral transaction succeeded...", 3);

} catch (...) {
    // catch miscellaneous exceptions here
} } while (0);

// Unwind in reverse order, releasing only what was actually acquired.
if (bUnwindTableDebitsLock) {
    tableDebits.unlock();
}
if (bUnwindTableDebitsOpen) {
    tableDebits.close();
}
if (bUnwindTableCreditsLock) {
    tableCredits.unlock();
}
if (bUnwindTableCreditsOpen) {
    tableCredits.close();
}
return (rc);
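To watch the unwind flags do their job, the same shape can be reduced to something that compiles on its own. This is only a sketch: `StubTable`, the `RC` enum, `transact` and the trace vector are hypothetical stand-ins, logging is left out for brevity, and the transaction step is just a comment.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum RC { OK = 0, FAILED = 1 };

// Hypothetical stand-in for a table: records every call in a shared trace
// and fails its lock() on demand.
struct StubTable {
    std::string name;
    std::vector<std::string>* trace;
    bool failLock;

    RC open()     { trace->push_back(name + ".open"); return OK; }
    RC lock()     { trace->push_back(name + ".lock"); return failLock ? FAILED : OK; }
    void unlock() { trace->push_back(name + ".unlock"); }
    void close()  { trace->push_back(name + ".close"); }
};

// Acquire in order, remember each success in a flag, release in reverse.
RC transact(StubTable& credits, StubTable& debits) {
    assert(&credits != &debits);
    RC rc = OK;
    bool creditsOpen = false, creditsLocked = false;
    bool debitsOpen = false, debitsLocked = false;

    do {
        if ((rc = credits.open()) != OK) break;
        creditsOpen = true;
        if ((rc = credits.lock()) != OK) break;
        creditsLocked = true;
        if ((rc = debits.open()) != OK) break;
        debitsOpen = true;
        if ((rc = debits.lock()) != OK) break;
        debitsLocked = true;
        // the actual transaction would run here
    } while (0);

    if (debitsLocked)  debits.unlock();
    if (debitsOpen)    debits.close();
    if (creditsLocked) credits.unlock();
    if (creditsOpen)   credits.close();
    return rc;
}
```

If the debits table is set up to fail its lock, the trace reads: credits.open, credits.lock, debits.open, debits.lock, debits.close, credits.unlock, credits.close. The failed lock is never unlocked, and everything that was acquired is released in reverse order.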


And the winner is...

Now that I've rested my slam-dunk case, I'll declare a tie. It's a tie. It doesn't matter. Diligence about handling errors is the key.

If my life depended on the software (and it probably does, for many of the embedded CPUs in airliners, automobiles, etc.), I would want a meticulous approach to error-handling taken. Every method call checked.

So, consistency -- whether you're using return-codes or exceptions -- is the key. If the only thing that comes out of this debate is that more developers are cognizant of the consistency issue (e.g., the try, catch, finally clauses are always applied... or the do... while structured approach is consistently used), then it will all have been worthwhile.

p.s., don't even get me started on asserts!
