Making the same mistakes
There is a mistake in software engineering that we, as engineers, keep making again and again and again. We apparently are really bad about resource management or we're really naive. I can't decide which. Probably both.
We already learned that we were way too trusting of memory management, so most modern systems use garbage collection to handle that for us. As it turns out, most of these systems are pretty heinous about resource management that doesn't fall into the memory category. So in Java, we created the finalize() method, except that there's no guarantee that it will ever be called, so it's useless. In .NET we can use the IDisposable pattern (see IDisposable Made E-Z) or create some other kind of convention.
This is the problem. When you create a convention, you need to trust that forever and ever, all clients of your code will correctly adopt this convention and follow it to the letter. The problem is that we're people and we're just not very good about following directions.
I really like stack allocated variables in C++ in that I can make something happen when they're constructed and also something can happen when they go out of scope. This is tremendous and it bugs me that they don't really exist in C#. In fact, I'd like to see the ability to make stack allocated variables other than structs so that I can make them come and go automatically and reliably. This makes it so easy to prevent resource leaks because you can follow an Acquire/Release model implicitly in the lifetime of a call.
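To make the Acquire/Release idea concrete, here is a minimal C++ sketch. Everything in it (Log, ScopedResource, UseResource, RunBothPaths) is a made-up name for illustration, not part of any real library. The point is only that the destructor runs when the variable leaves scope, whether the function returns normally or throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log that records the order in which things happen.
struct Log {
    std::vector<std::string> entries;
};

// Acquires in the constructor and releases in the destructor.
class ScopedResource {
public:
    explicit ScopedResource(Log &log) : m_log(log) { m_log.entries.push_back("acquire"); }
    ~ScopedResource() { m_log.entries.push_back("release"); }

    // Copying would release twice, so forbid it.
    ScopedResource(const ScopedResource &) = delete;
    ScopedResource &operator=(const ScopedResource &) = delete;

private:
    Log &m_log;
};

// The caller never writes a release call; leaving the scope does it.
void UseResource(Log &log, bool fail) {
    ScopedResource guard(log);
    log.entries.push_back("work");
    if (fail) throw std::runtime_error("something went wrong");
}

// Exercises both the normal path and the throwing path.
Log RunBothPaths() {
    Log log;
    UseResource(log, false);
    try {
        UseResource(log, true);
    } catch (const std::runtime_error &) {
        log.entries.push_back("caught");
    }
    // acquire, work, release (twice), then caught.
    assert(log.entries.size() == 7);
    return log;
}
```

On the throwing path, "release" is recorded before "caught", because the guard is destroyed while the stack unwinds. Nobody calling UseResource has to remember a convention.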
The .NET event pattern is another case of this. When you add an event handler to an event, IntelliSense is really nice about offering to implement the full call to construct your new event delegate and then to also implement the stubs for the method itself. This is great, except that I've noticed that most of the time this code is not at all what you really want.
Using our OcrEngine as a model, you could do something like:
private OcrDocument DoRecognize(OcrEngine engine, ImageSource source)
{
    engine.DocumentProgress += new OcrDocumentProgressEventHandler(_engine_DocumentProgress);
    return engine.Recognize(source);
}
Which feels natural and IntelliSense saves you a bunch of typing. The problem is that if this method gets called several times with the same engine object, it will attach many event handlers onto it, none of which will get released, nor can they be. There is no way to effectively iterate over the listeners to an event and find one, because we don't know what to look for. So instead, you need to add and remove the handlers:
private OcrDocument DoRecognize(OcrEngine engine, ImageSource source)
{
    OcrDocumentProgressEventHandler docHandler = new OcrDocumentProgressEventHandler(_engine_DocumentProgress);
    engine.DocumentProgress += docHandler;
    OcrDocument doc = engine.Recognize(source);
    engine.DocumentProgress -= docHandler;
    return doc;
}
Now this will work better, but the code looks substantially worse. And it's also wrong - if there is an exception in the processing (and trust me, there is a lot that can go wrong), the exception will blow through this without cleaning up anything, so your engine will still have an extra event handler on it. OK, try three:
private OcrDocument DoRecognize(OcrEngine engine, ImageSource source)
{
    OcrDocumentProgressEventHandler docHandler = new OcrDocumentProgressEventHandler(_engine_DocumentProgress);
    engine.DocumentProgress += docHandler;
    try {
        OcrDocument doc = engine.Recognize(source);
        return doc;
    }
    finally {
        engine.DocumentProgress -= docHandler;
    }
}
This will work just fine, but the code looks horrible.
In C++, I'd be inclined to make a helper class that installs the event handler on construction and removes it on destruction, so the code might look like this:
//...
private:
    OcrDocument *DoRecognize(OcrEngine *engine, ImageSource *source)
    {
        DocProgressInstaller installer(engine, new OcrDocumentProgressEventHandler(_engine_DocumentProgress));
        return engine->Recognize(source);
    }
//...
class DocProgressInstaller {
public:
    DocProgressInstaller(OcrEngine *engine, OcrDocumentProgressEventHandler *handler) : m_engine(engine), m_handler(handler)
    {
        m_engine->DocumentProgress += m_handler;
    }
    // Detach the handler from the event when the installer goes out of scope.
    virtual ~DocProgressInstaller() { m_engine->DocumentProgress -= m_handler; }
private:
    OcrEngine *m_engine;
    OcrDocumentProgressEventHandler *m_handler;
};
And lo and behold, the code where this is used looks very nice and is readable. This tedious code is hidden behind the scenes and operates correctly during a throw. But we're talking C# here, so it doesn't work.
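For anyone who wants to try the C++ idea, a self-contained version that compiles might look like the sketch below. Engine is a hypothetical stand-in for OcrEngine, and since standard C++ has no += events, I'm assuming an add/remove interface that hands out integer tokens. ProgressInstaller, AddProgressHandler, RemoveProgressHandler and HandlerCount are all invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for OcrEngine: keeps a list of progress handlers.
class Engine {
public:
    using Handler = std::function<void(int)>;

    // Registers a handler and returns a token that identifies it.
    int AddProgressHandler(Handler handler) {
        int token = m_nextToken++;
        m_handlers.push_back({token, std::move(handler)});
        return token;
    }

    // Removes the handler registered under the given token, if any.
    void RemoveProgressHandler(int token) {
        m_handlers.erase(
            std::remove_if(m_handlers.begin(), m_handlers.end(),
                           [token](const Entry &e) { return e.token == token; }),
            m_handlers.end());
    }

    std::size_t HandlerCount() const { return m_handlers.size(); }

    // Reports progress to every handler, then fails if asked to.
    int Recognize(bool fail) {
        for (const Entry &e : m_handlers) e.handler(100);
        if (fail) throw std::runtime_error("recognition failed");
        return 42;
    }

private:
    struct Entry {
        int token;
        Handler handler;
    };
    std::vector<Entry> m_handlers;
    int m_nextToken = 0;
};

// Installs a handler on construction and removes it on destruction.
class ProgressInstaller {
public:
    ProgressInstaller(Engine &engine, Engine::Handler handler)
        : m_engine(engine), m_token(engine.AddProgressHandler(std::move(handler))) {}
    ~ProgressInstaller() { m_engine.RemoveProgressHandler(m_token); }

    // A copy would remove the same handler twice, so forbid copying.
    ProgressInstaller(const ProgressInstaller &) = delete;
    ProgressInstaller &operator=(const ProgressInstaller &) = delete;

private:
    Engine &m_engine;
    int m_token;
};

// The call site stays two lines long and cleans up even on a throw.
int DoRecognize(Engine &engine, bool fail, int &lastProgress) {
    ProgressInstaller installer(engine, [&lastProgress](int percent) { lastProgress = percent; });
    return engine.Recognize(fail);
}
```

After DoRecognize returns or throws, engine.HandlerCount() is back to what it was before the call, however many times you call it with the same engine.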
Again, if I could have the ability to make variables do something nice when they go out of scope, I'd be very happy - so happy that I'd willingly accept some new keywords in the language to make that apparent and to allow some new semantics. I imagine that you could either use stack or local as the keyword and use the following semantics:
stack variables need to be passed in to a method that declares a parameter as a stack variable (a la ref and out). Stack variables are destroyed when the variable goes out of scope. Stack variables cannot be assigned to, from, or returned.
That's a lot of restrictions but it does some great stuff: it allows me, the developer, to decide where the memory comes from and takes a fair amount of pressure off the garbage collector (and that's a great thing, IMHO. I talked with a former co-worker who had done a project that involved doing static analysis of garbage collected programs and he created a version that would do stack allocation and he got substantial performance increases that way). It allows resource management to happen automatically on clear boundaries. It makes my code easier to read.
Wanting to make something come and go at particular points in time is a recurring CS problem. Shouldn't it be an easy one to do right instead of an easy one to do wrong?