
WorldComp and IPCV 08 Day 4 - Wrap Up

Day 4 was the last day and for me it was a blur. There were a number of sessions that were missing half their speakers, and my eyes were starting to glaze over from some of the talks that featured a 'wall of math'. One talk that stood out was a math professor who was spearheading an effort to create open source code to cleanly and efficiently process images from ultra low-cost CCD cameras. He explained the problem domain very well. It really made it clear as to why most cheap camera output looks like garbage. I did my presentation in the last slot on the last day. I can't say I was happy about the timing, but you gotta dance with them what brung ya. I tried to make the best of it by bringing candy to the talk and to put as much energy into it as possible. My thought was that the people who stuck through this should get rewarded, not punished for sticking with it. I felt the talk went well and I think I'll put together several blogs about how to present a technical paper. I can't say that I'm an expert, but having taught middle and some high school, I did get some good experience explaining the technical to those who were on the edge of understanding.
Posted by Steve Hawley | 0 Comments

WorldComp and IPCV 08 Day Three

The length of the conference is certainly taking its toll. On top of the long days, my body has been trying to wake me up in a different time zone. Day three was dedicated to multimedia and compression. There were a number of talks that were beyond my attention span but looked worth investigating more. We also cross-pollinated a little more and looked at conferences outside of imaging, including research to try to read emotion from spoken words. There was one talk that was short and sweet which was a small bit of research to determine if compression of data for transmission is worth the cost in energy. I found it intriguing that for a particular class of processor the cost of an ADD instruction is 85 nanojoules. The idea is to determine if the cost of battery power used in running the code to compress and subsequently transmit the data is less than the cost of transmitting uncompressed data on its own. Rick and I had a nice talk with the speaker afterwards.
Posted by Steve Hawley | 0 Comments

WorldComp and IPCV Day 2

Day two had a great deal of work concerning feature extraction. Yesterday seemed to be about putting things into images and today was getting things out. A few things that caught my eye were some segmentation and classification work, hand tracking and identification, face tracking with pose estimation, automatic location of cracks in a highway, location of crosswalks for the blind, circle detection, and detection of pollution. One thing that is particularly striking about this conference is how multicultural it is. I've seen attendees and presenters from Spain, Italy, France, UK, Japan, Korea, China, Vietnam, India, and so on. It's quite a melange of language and culture and is a great reminder of how important it is to write culture-neutral software. It's also a strong reminder of the inherent biases towards English in pretty much every accepted programming language. Thinking about this makes me think about the parallels of Latin and English for the publication of scholarly work. English is a fairly terrible language (yet somehow millions of children a year manage to become conversant in it), and it must be a tremendous strain for a presenter to have to field questions in it. It makes me wonder if 17th century and earlier scholars faced the same challenges in communication. And lest I forget, the gender diversity is also a welcome change when coming from an industry that is traditionally male dominated.
Posted by Steve Hawley | 0 Comments

WorldComp and IPCV 08 Day 1

I'm blogging from the road in Vegas and posting this from my phone. I'm using Opera on a WM6 device and it seems to work admirably. Our flight was delayed on the outbound side so we ended up getting 4 hours of sleep in a hotel in Milwaukee before heading on to Vegas. After a few hotel snafus, we got in to catch the last keynote about reconfigurable computing. Nifty work. A few notes in no particular order:

* It's nice to see such a diverse crowd. This is truly an international conference.
* The material has been fairly fascinating and there are some interesting topics. I can totally see how to create digital watermarks in images based on one of the talks, so if you need that as an Atalasoft customer, let me or sales know.
* Saw some interesting work on fingerprint recognition that would apply to document binarization.
* Found myself asking critical questions in nearly every session. It's good to geek out.
* At the conference dinner, Rick and I picked our table and companions more or less at random and had some good partners.

Time to clean up for day 2.
Posted by Steve Hawley | 1 Comments

Off to IPCV08

Next week Rick and I will be in Las Vegas at IPCV08, part of WorldComp.  In addition to being a huge industry conference with a lot of interesting sessions, I'll be presenting a paper on accelerating the Hough Transform.  My session is on Thursday July 17, 4:40 in Ballroom 7 of the Monte Carlo Resort.

In the meantime, here's an app with source that I will touch on in my talk. 

Posted by Steve Hawley | 0 Comments

Don't Be Stupid (like me)

I had a frustrating situation last week.  I was trying to debug some unmanaged code through a managed project and discovered that something had happened to my system such that it was impossible to start up a project with unmanaged debugging turned on.  If you had it switched off, any project would run fine in the debugger.  If you had it switched on, the app would crash hard with an ExecutionEngineException before entering Main.

I can't tell you what changed on my system to cause this.  I had a few suspicions, but nothing that I could act upon.  I reinstalled Visual Studio hoping that would solve the issue, but no go.

After losing two days, I chose to have my machine nuked from orbit.  I made a checklist of all the apps that I would lose and copied my documents (stored in My Documents) to an external hard drive.  I made one very stupid mistake. I assumed that source control was my friend (more so than any reasonable person should) and neglected to make a back up of my source tree.  When my machine was nuked it took out all the files I had checked out.  Was it really 93?  Yikes.  Most of those were files that I was conditionally compiling out of the existing code, so some 80 or so of those were easy to recreate by breaking the locks and redoing the work of adding in

#if USE_OLD_CODE_LIBRARY
...
#endif

Fortunately, all the "hard" work I had done had been checked in and I lost only a small amount of that code.  What I had lost was mostly managed code, for which I still had copies of the compiled DLLs in test rigs on my backup.  I pulled those and used .NET Reflector to get back an approximation of my original code.  In two days, I was back to where I was and in better shape, as I fixed one bug along the way just from reading code (and, it turns out, fixed the bug I was looking for when unmanaged debugging went away).

In total, this situation cost me four days and my stupidity cost me two of those.  Hindsight tells me that I was not scrupulous enough in making a checklist of things to do before nuking a machine.  "What about my source?" should have been number one on my list.

It occurs to me as well that this is exactly something that a tightly integrated source control system like TFS should be able to manage for me.  TFS knows when I save a file locally.  Couldn't it also shadow my changes onto the server?  That way my code would be one save away from recovery instead of one check-in or shelve.

How to Build a Managed/Unmanaged Library

If you are tasked with exposing an unmanaged library through managed code, there are several approaches that you can take.  The approach you take will depend upon what format your unmanaged code is in.

If you are given an unmanaged dll, it is sensible to simply use P/Invoke to expose the functionality.  This will work and will get you going quickly, but it leads to a problem of how to make sure that the dll you have is loaded.  If you leave it to the OS, this means that the unmanaged dll needs to live in the same folder as the calling assembly or in the system folder.  You can also change the path that is searched for loading a dll by calling a P/Invoke of SetDllDirectory, or you can manually load the dll in your own code.  If you do that, your dll loader needs to be called before the first P/Invoke into the dll.

Using P/Invoke works, but may be costly in the marshaling, especially if you have to call the routine repeatedly.

What I prefer to work with is an unmanaged static library.  With that, you can build a managed C++ wrapper that exposes the functionality that you need.  The C++ compiler does some fairly amazing things in terms of knowing when to do managed/unmanaged transitions, but sometimes it does some surprising things that will cost you.

Let's assume that you are a good developer and have built a customer management API using STL collections.   You might expose a method like this:

public __gc class CustomerRelations {
public:
    bool IsCustomerAvailable(String *name)
    {
        // adapt the managed String for the unmanaged call
        StringAdapter cstringName(name);
        vector<Customer *> customers = GetCustomers(); // lib call
        for (size_t i = 0; i < customers.size(); i++) {
            if (customers[i]->MatchesName(cstringName))
                return true;
        }
        return false;
    }
};

The problem here is that the API is chatty - it's nice to have the interface right to the metal of the vector, but the question is, when this is compiled and linked, will the STL code be managed or unmanaged?  Worse than that, if the compiler generates a managed version of one of the STL routines, it will use it in all the unmanaged code as well.  In other words, code that you thought was supposed to be pure unmanaged will be doing very granular transitions between managed and unmanaged calls, and that will be pretty much out of your control.

One way around this is to make sure that all code that is touching unmanaged APIs is also unmanaged.  This can be handled by writing classes or top level functions that are themselves unmanaged and less chatty.

This works, but is prone to error - if you screw up, you may inadvertently tank your performance.  I found this out, by the way, by stepping into an unmanaged library routine with unmanaged debugging off.  The debugger kept executing code until it hit the next unmanaged frame, which was in the middle of deep library code in STL.
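As a rough sketch of the "unmanaged and less chatty" helper described above, the whole loop can live in one native function so that managed code makes a single call.  This is written in portable C++ so it can be tried anywhere; the Customer class here is a hypothetical stand-in for whatever the real library provides, and IsCustomerAvailableNative is a made-up name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the unmanaged Customer type from the library.
class Customer {
public:
    explicit Customer(const std::wstring &name) : m_name(name) { }
    bool MatchesName(const std::wstring &name) const { return m_name == name; }
private:
    std::wstring m_name;
};

// Under Visual C++ with /clr, this function would sit between
// #pragma managed(push, off) and #pragma managed(pop) so that the loop
// and the STL calls inside it are compiled as native code.
bool IsCustomerAvailableNative(const std::vector<Customer *> &customers,
                               const std::wstring &name)
{
    // One call from the managed side covers the whole search.
    for (std::size_t i = 0; i < customers.size(); i++) {
        if (customers[i]->MatchesName(name))
            return true;
    }
    return false;
}
```

The managed wrapper would then convert the string once and make one call into this function, rather than crossing the boundary on every element.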

Here's a more robust solution - make the exposed API as chunky as possible.  It's really tempting to give direct access to collections, but this is problematic.  It's far better, if you can, to use managed collections to hold your unmanaged objects.

One approach to make sure that you don't expose anything is to use the old-school C trick of making your API completely opaque.  You can do this by making two header files, one public and one private.  The public one defines opaque types:

// public API

#pragma managed(push, off)

namespace StupendoCustomer {
typedef struct t_CustomerStruct *t_CustomerHandle;
extern t_CustomerHandle GetCustomerByName(wchar_t *name);
extern bool IsCustomerAvailable(wchar_t *name);
extern INT32 GetCustomerCount();
extern void GetCustomers(t_CustomerHandle *arr, INT32 startIndex, INT32 count);
} // namespace


#pragma managed(pop)

 

// private API - in a different file
namespace StupendoCustomer {
const UINT32 kCustomerMagicNumber = 0x00C5708e6; // arbitrary
// full definition of the struct that the public header only names
struct t_CustomerStruct {
    UINT32 m_magic;
    CustomerObject *m_customer;
};
} // namespace

 

In this case, managed  code will see nothing more than an anonymous struct for the customer, whereas unmanaged code gets the actual definition of the struct.  The m_magic field is a trick to put in a sanity check so that if the lower level API gets passed a pointer to something other than the real structure, it can be checked quickly.

The GetCustomerCount/GetCustomers pair is a way to show how you can hide the collection code by letting the caller allocate space for the array and then fetch elements from it.  The calling code, presumably managed, would most likely wrap the objects in its own class and make a collection of those objects.
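Here is a minimal sketch of what might sit behind those declarations, assuming a simple in-memory store.  Store(), AddCustomer() and IsValidHandle() are hypothetical helpers that are not part of the API above, CustomerObject is a stand-in type, and filling out-of-range slots with null handles is a choice made for this sketch only.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>

namespace StupendoCustomer {

typedef struct t_CustomerStruct *t_CustomerHandle;

// Hypothetical stand-in for the real customer object.
struct CustomerObject { std::wstring name; };

const std::uint32_t kCustomerMagicNumber = 0x00C5708e6; // arbitrary
struct t_CustomerStruct {
    std::uint32_t m_magic;
    CustomerObject *m_customer;
};

// Hypothetical backing store. A deque keeps element addresses stable
// when new entries are appended, so handles stay valid.
static std::deque<t_CustomerStruct> &Store()
{
    static std::deque<t_CustomerStruct> store;
    return store;
}

// Hypothetical helper so the sketch has some data to return.
void AddCustomer(CustomerObject *customer)
{
    t_CustomerStruct entry = { kCustomerMagicNumber, customer };
    Store().push_back(entry);
}

// The m_magic sanity check: cheap to run at the top of every API call.
bool IsValidHandle(t_CustomerHandle h)
{
    return h != nullptr && h->m_magic == kCustomerMagicNumber;
}

std::int32_t GetCustomerCount()
{
    return static_cast<std::int32_t>(Store().size());
}

// The caller owns arr and must make room for count handles.
void GetCustomers(t_CustomerHandle *arr, std::int32_t startIndex, std::int32_t count)
{
    const std::int32_t total = GetCustomerCount();
    for (std::int32_t i = 0; i < count; i++) {
        const std::int32_t src = startIndex + i;
        // Slots past the end of the collection get a null handle.
        arr[i] = (src >= 0 && src < total)
            ? &Store()[static_cast<std::size_t>(src)]
            : nullptr;
    }
}

} // namespace
```

A caller asks for the count, allocates an array of that many handles, and fetches them in one call, so no STL type ever crosses the boundary.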

Beware that you understand who owns the pointer to the struct and who is responsible for disposing it (might also be a job for smart pointers).  Chances are that if you're writing a managed wrapper for the customer object, it will have to implement IDisposable.
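On the unmanaged side, one way to make the ownership explicit is a smart pointer with a custom deleter.  This is only a sketch: CreateCustomer and DestroyCustomer are hypothetical functions (the API above doesn't define how handles are created or released), and the struct and the live-handle counter exist purely so the example can be run.

```cpp
#include <memory>

// Hypothetical native API: a handle type plus create/destroy functions.
struct t_CustomerStruct { int m_id; };
typedef t_CustomerStruct *t_CustomerHandle;

static int g_liveCustomers = 0; // counts undisposed handles, for demonstration only

t_CustomerHandle CreateCustomer(int id)
{
    ++g_liveCustomers;
    return new t_CustomerStruct{ id };
}

void DestroyCustomer(t_CustomerHandle h)
{
    if (h) {
        --g_liveCustomers;
        delete h;
    }
}

// The deleter routes disposal through the library's own destroy function.
struct CustomerDeleter {
    void operator()(t_CustomerHandle h) const { DestroyCustomer(h); }
};

// Whoever holds a CustomerPtr owns the handle; it is released on scope exit.
typedef std::unique_ptr<t_CustomerStruct, CustomerDeleter> CustomerPtr;
```

A managed wrapper can't hold a unique_ptr as a field directly, which is where IDisposable comes in: its Dispose would call the same destroy function.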

A Geek Chooses Lunch

Not so much on programming, but on geekery.

One local lunch spot faxes us a list of daily specials and I decided that I wanted pizza, but the question was whether to choose the $6.25 10" two topping pizza with a 12 ounce soda or split a $9.45 16" one topping pizza with a 2L birch beer.  I love birch beer, so that's not  a deciding factor.

What I did (and iteratively adjusted it with Elaine) was to normalize the meals into units of price per food, where the food units were square inch-liter-toppings. 

So in this case you take the price divided by the area of the pizza x toppings x volume of soda.  Ignoring the egregious mixing of metric and English units, this is fairly reasonable.  I did the math with a calculator, but for this blog, I popped this into a spreadsheet and got the following results, using πr² for area of the pizza:

The personal 2-topping pizza meal is 10.61 cents/in²-L-topping, whereas the large pizza is 2.35 cents/in²-L-topping.  This is a normalized price, so lower is better and the value of the larger pizza is clearly greater.  In addition, the area of the personal pizza is 78.54 in², whereas the area of the large pizza is 201.06 in², therefore my share of the large is greater than the whole small.  The soda is similarly a better deal.

Rick and I shared a large with Buffalo chicken as the topping and there are left-overs for today.  In addition, we answered the trivia question about encephalitis and got free garlic bread sticks.  It was a good lunch.  And yes, we really are that geeky.

Woo-hoo! Compiler Bug!

I just finished tracking down a Visual C++ compiler bug.  This bug exists in the optimizer in VC 2003 and 2005.  It does not exist in VC 2008.  It only happens with optimization turned on (ie, Release build).

Finding this bug was interesting in that we had a defect reported from a customer that caused an exception to be raised when trying to load a particular image.  This image had an error in it and was causing a throw deep within the codec.  This looked like an easy bug to find.  Unfortunately for me, unit tests would pass on my machine, but not on the build server.  This turned a routine bug into a nightmare bug - the debugger was not likely to help me.  What I ended up doing was taking the library that (most likely) contained the bug, building it in release, and using that in the debug build on my machine.  This allowed me to reproduce the bug (and proved that it was a release-only bug).  Using divide and conquer, I found where the exception was being thrown and, to my great surprise, where the catch was being totally missed.

The bug is this:

If you have a try/catch block in C++ that only calls "pure" C code, the optimizer sees an opportunity to remove the catch block as C can't throw.  If, however, the C code calls back into C++ and that code throws, the exception will blow past the handler.

It is arguable that C++ code called from C should never throw without catching before returning to the caller.  If the catch block is outside the C, it could cause memory or resource leaks.  In this particular case, the function being called was a function pointer acting as an event handler.  Basically, it was signaling to the caller, "something bad has happened - I've cleaned up, now it's your turn".

To illustrate how this happens, I've created a minimal code example to reproduce it.  This code is a simple C++ app that calls a C function, PerformOperation, with a C++ callback.  PerformOperation prints a message then calls the callback if it's non-null.  The C++ implementation of the callback throws an empty class to signal a failure.  The calling code should catch this error and report failure.  In debug, you get correct behavior.  In release, you get an unhandled exception.

 File - Perf.h - a header describing an external operation in C:

#ifndef _H_Perf
#define _H_Perf

#ifdef __cplusplus
extern "C" {
#endif

typedef void (*fPerformer)();

extern void PerformOperation(fPerformer pf);

#ifdef __cplusplus
} // extern "C"
#endif

#endif

File - Perf.c - an implementation of the function PerformOperation

#include "Perf.h"
#include <stdio.h>

#ifdef __cplusplus
extern "C" {
#endif

void PerformOperation(fPerformer pf)
{
    printf("Performing operation...");
    if (pf)
        pf();
}

#ifdef __cplusplus
} // extern "C"
#endif
 

File - OptimizerBug.cpp - calls the performer from C++ with a C++ callback that throws

#include <iostream>
#include <tchar.h>
#include "Perf.h"

using namespace std;

class PerfError { };

static void CPPPerform()
{
    throw PerfError();
}

int _tmain(int argc, _TCHAR* argv[])
{

    bool fail = false;
    try {
        PerformOperation(CPPPerform);
    }
    catch (PerfError) {
        fail = true;
    }

    cout << (fail ? "fail." : "pass.");

    return 0;
}

Here's what happens - in a debug build, you will see "Performing operation...fail." - this is correct output.  In a release build, the code will crash with an unhandled exception.  I need to also stress that the function PerformOperation needs to live in its own file.  If it lives in OptimizerBug.cpp, the optimizer goes even further and notices that PerformOperation can be inlined, and since it's only being used once, the callback can be inlined too and that it will always throw.  It's a nice chunk for call->call optimization, but it makes the bug go away.  If the implementation is in a different file, the optimizer doesn't inline.

Here's the assembly output for the release build in VC 2005:

; 20   :     bool fail = false;
; 21   :     try {
; 22   :         PerformOperation(CPPPerform);
;
; Here's the call to PerformOperation
;
    push    OFFSET ?CPPPerform@@YAXXZ        ; CPPPerform
    call    _PerformOperation

; 23   :     }
; 24   :     catch (PerfError) {
; 25   :         fail = true;
; 26   :     }
; 27   :
; 28   :     cout << (fail ? "fail." : "pass.");


;
; and here's the call to cout, operator << - you'll notice
; that the fail = true is not here.
;

    push    OFFSET ??_C@_05MFHHNNDH@pass?4?$AA@
    push    OFFSET ?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::cout
    call    ??$?6U?$char_traits@D@std@@@std@@YAAAV?$basic_ostream@DU?$char_traits@D@std@@@0@AAV10@PBD@Z ; std::operator<<<std::char_traits<char> >
    add    esp, 12                    ; 0000000cH

; 29   :
; 30   :     return 0;

    xor    eax, eax

There is a workaround - the most basic is to refactor to never throw in a C++ callback called from C.  Where this is not possible, the routine with the catch block needs to be surrounded by:

#ifdef NDEBUG
#pragma optimize("", off)
#endif

// ... the routine containing the try/catch goes here ...

#ifdef NDEBUG
#pragma optimize("", on)
#endif

This bug is NOT fixed by changing the catch to catch(...) - the optimizer will take out the handler no matter what. 
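The first workaround (never throw from a C++ callback called from C) could look something like the sketch below.  It keeps the fPerformer signature from Perf.h and repeats PerformOperation so the example stands alone.  The error flag g_callbackFailed, DoRealWork, and RunOperation are hypothetical names; since the callback takes no arguments, a file-level flag is one simple way to carry the failure back, though it is not thread safe as written.

```cpp
#include <cstdio>

extern "C" {
typedef void (*fPerformer)();

// Same behavior as PerformOperation in Perf.c, repeated here so the
// sketch is self-contained.
void PerformOperation(fPerformer pf)
{
    std::printf("Performing operation...");
    if (pf)
        pf();
}
} // extern "C"

class PerfError { };

static bool g_callbackFailed = false; // hypothetical flag set by the callback

// Stand-in for the real work, which fails by throwing.
static void DoRealWork()
{
    throw PerfError();
}

// The callback handed to C catches everything itself, so no exception
// ever has to travel through the C frames.
static void CPPPerformNoThrow()
{
    try {
        DoRealWork();
    }
    catch (...) {
        g_callbackFailed = true;
    }
}

// Returns true on success, false if the callback reported a failure.
bool RunOperation()
{
    g_callbackFailed = false;
    PerformOperation(CPPPerformNoThrow);
    return !g_callbackFailed;
}
```

With this shape there is no try/catch wrapped around the call into C, so there is nothing for the optimizer to remove.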

On a more meta level, I want to talk more about bugs that happen only in release and not in debug.  These are among the most frustrating bugs as it looks like you have to shed your main tool for tracking them down - your debugger.  In my case, I was able to isolate the behavior and build that particular component in release.  You can still use the debugger in release, but it's not as useful as you might think since the optimizer may reorder operations and you will see some truly bizarre behavior.  For example, I watched the execution of an if (condition) statement where condition was false - and the debugger stepped into the block (!!).  This was because a lot of the method had been rearranged by the optimizer to reduce size and increase speed.  I find it easier to use the Disassembly window in Visual Studio so I can better see what the compiled code is.  While doing this, I spotted the missing catch block - quite the WTF.

Compiling, Linking, and Linq-ing

Reading through Rick's Flat File post had me thinking about the benefits of having LINQ at your fingertips.

It strikes me that the process of figuring out which variables you're touching when you're compiling a line of code is really a database query.  Scoping and the semantics of scoping are part of the query (as well as how the database has been built).

Further, the actual link of a completed compile (whether or not it's being done at build time or run time), is another query.

The process of compilation should really be the process of building up a database.
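To make the "name lookup is a query" idea a little more concrete, here is a toy sketch.  The Symbol record and Resolve function are invented for illustration, and scoping is flattened to a single nesting depth, which real compilers do not do (they track a tree of scopes).  The point is only that resolving a name reads like a filter plus an ordering.

```cpp
#include <string>
#include <vector>

// One row in a toy symbol "table".
struct Symbol {
    std::string name;
    int scopeDepth;   // 0 = global, larger = more deeply nested
    std::string type;
};

// Roughly: SELECT * FROM symbols
//          WHERE name = :name AND scopeDepth <= :currentDepth
//          ORDER BY scopeDepth DESC LIMIT 1
const Symbol *Resolve(const std::vector<Symbol> &table,
                      const std::string &name, int currentDepth)
{
    const Symbol *best = nullptr;
    for (const Symbol &s : table) {
        if (s.name == name && s.scopeDepth <= currentDepth) {
            // Prefer the innermost visible declaration.
            if (!best || s.scopeDepth > best->scopeDepth)
                best = &s;
        }
    }
    return best; // nullptr means an undeclared identifier
}
```

With an x declared at depth 0 and again at depth 2, a lookup from depth 1 finds the outer one and a lookup from depth 2 finds the inner one.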

That then ties into Rick's query as to why we're not using a database for our source instead of a flat file.  One answer is that we could be.  If there is a reflexive transformation that can turn a file into blobs, then we could already be there.

The reason to keep source in flat files has as much to do with tradition as it does with human factors.  People are very aware of space and spatial layout of things and this translates naturally into flat files.  People develop a familiarity with the layout of a file and can navigate very efficiently to the right location within it via muscle memory.  Taking that away takes away a significant information management skill.  It is important to replace it with one or more navigational methods that leverage similar or better skills.

Posted by Steve Hawley | 2 Comments

Portable APIs

Having looked at and created quite a few APIs, I thought I'd put together a few hints on how to create a good, portable API.

First, understand what you're trying to do.  Seriously.  Writing an application and writing an API are two different things.  I've seen some APIs which were literally wrappers around a function declaration that looks hauntingly like C's main.  Methods in the "API" turned parameters into command-line arguments and passed them in to a single entry point.

Second, decide on your target platform(s) and what resources you'll need and what resources will need to be provided.  Choices could include UNIX flavors, Windows, Mac OS, JVM, .NET, etc.  This is an important early decision and will determine how much pain you may suffer in the future.

Third, decide on your I/O mechanism, if any.  If your answer is "file names", you're probably taking the wrong approach.  File names look like they're easy - just pass in a string, hand it to fopen() and off I go.  The problem with this is, what constitutes a path?  How long is it?  What's the path separator?  If I am creating the file, who gets ownership?  Who has permissions?  Are the permissions portable?  If components of the path don't exist, should the API create them or fail?  You didn't think you'd be creating so many problems with such a simple API, did you?  A streaming API is far more flexible and brings far fewer problems.  A stream is usually an abstract object that represents a data source or sink (or both) and offers methods to read, write, seek, get position, get length, flush, and close.

Fourth, decide what you need to model and how you're going to do it and then how it will be presented.  A perfect example is date and time.  There are lots of ways to represent such an object and chances are you will be reinventing the wheel.  Look at what your target platform offers before you go make something new.  If you are supporting multiple target platforms, you are best off trying to abstract the behavior you want and then implementing the concrete version - maybe in terms of something existing on that platform.  By the way, if you need to provide an abstract notion of something, please, please, please create concrete implementations for your target so your clients don't have to.  Using the stream idea above, if your target is Windows, do provide at least a stream that constructs from a file handle or a Windows file name.  If your target is .NET, include a version that works for Stream.  If you do this, your customers will love you.  If you don't, your customers will, at best, be annoyed by you.  For example, if you create an abstraction of a rectangle that you use in your APIs and at least one of your targets is .NET, make sure that your special rectangles can be constructed from .NET System.Drawing.Rectangle and System.Drawing.RectangleF, and can generate them as well.

Fifth, understand what you're wrapping fully and if it's obscure, be sure to take some time to explain it in your documentation.  Some things that people think they understand, but usually don't, include typography, color and color representations, printing, and localization.  For each one of these, I can rattle off APIs that are clearly wanting for them (I'm looking at you, Graphics.DrawString).
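To put some shape on the stream idea from the third point and the "ship a concrete implementation" plea from the fourth, here is a minimal sketch.  The Stream and MemoryStream names and method signatures are my own illustration, not any particular library's API; a real API would also need an error-reporting story and file-backed implementations for each target platform.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Abstract data source/sink: the only I/O type the API would accept.
class Stream {
public:
    virtual ~Stream() { }
    virtual std::size_t Read(void *buffer, std::size_t count) = 0;
    virtual std::size_t Write(const void *buffer, std::size_t count) = 0;
    virtual bool Seek(std::size_t position) = 0;
    virtual std::size_t Position() const = 0;
    virtual std::size_t Length() const = 0;
    virtual void Flush() = 0;
    virtual void Close() = 0;
};

// One concrete implementation shipped with the API so clients don't
// have to write their own just to get started.
class MemoryStream : public Stream {
public:
    MemoryStream() : m_pos(0) { }

    std::size_t Read(void *buffer, std::size_t count) override
    {
        // Reads are clipped to the bytes remaining after the position.
        const std::size_t n = std::min(count, m_data.size() - m_pos);
        if (n > 0)
            std::memcpy(buffer, &m_data[m_pos], n);
        m_pos += n;
        return n;
    }

    std::size_t Write(const void *buffer, std::size_t count) override
    {
        // Writing past the end grows the stream.
        if (m_pos + count > m_data.size())
            m_data.resize(m_pos + count);
        if (count > 0)
            std::memcpy(&m_data[m_pos], buffer, count);
        m_pos += count;
        return count;
    }

    bool Seek(std::size_t position) override
    {
        if (position > m_data.size())
            return false;
        m_pos = position;
        return true;
    }

    std::size_t Position() const override { return m_pos; }
    std::size_t Length() const override { return m_data.size(); }
    void Flush() override { }  // nothing is buffered
    void Close() override { }  // nothing to release
private:
    std::vector<unsigned char> m_data;
    std::size_t m_pos;
};
```

None of the file name questions from the third point (separators, permissions, ownership) appear anywhere in this interface, which is rather the point.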

Working in Opposite Land

There are a number of interesting things that come up in computer science and computer engineering where reasoning suddenly jumps into opposite land.  It seems that the more human the operation, the more likely we are to go to opposite land.  For example, it has been measured that it is more efficient to send your engineers home at a reasonable hour when a project is late than it is to have them work long hours.  Efficiency doesn't scale with time spent, but exhaustion sure does.

My trip into opposite land has to do with finding and fixing bugs. From experience (and I wish I had the time to set up a proper study to examine this), I have noticed that I am more efficient in finding and fixing bugs by reading code.  Just reading - not time spent in the debugger.

Don't get me wrong - debuggers are awesome tools and I've spent enough time working on systems where the debugger was the equivalent of printf or worse, and I don't really miss that.

Debuggers, however, are good for isolating and reproducing bugs, but not for finding and fixing.  The downside to debuggers is that they tend to narrow your vision.  This can be a good thing - if your bug is an off-by-one error, swapped parameters, or a typo.  These bugs are trivial in nature.  The more significant bugs are those that are caused by faulty logic, poor design, hurried or incomplete coding, and so on.  With narrow vision, it is natural to apply the fix at the point of failure rather than making the actual fix.  More's the pity, because putting a fix at the point of failure makes a fragile fix and can simply hide the problem.  This isn't debugging - it's hacking.

Instead, you should read code with the goal of seeing the big picture as well as the little picture.  Could this be better fixed with a design change in the model than with a point fix?  Is the problem that there is a break in the contract between API and client?  Reading code lets you not just fix this bug but also defend against bugs of the same type in the future.  Maybe it's a good time to apply a debug assert or a throw for parameters out of range and set the right unit test on it.  Reading also lets you get a better feel for flow than stepping with the debugger.  You can divide and conquer chunks of code by factoring if statements on assumptions and use that to devise further tests to make sure that you're fixing what you are intending.

And above all, recidivism is not an option.
 

In sum, when you've reproduced your bug, stop before you apply a spot fix.  Read.  Ask questions about the big picture and defend for future bugs.
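As a small illustration of defending for future bugs, here is a sketch of the "throw for parameters out of range, debug assert for internal assumptions" idea mentioned above.  PixelAt is a hypothetical function invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical accessor that enforces its contract with the caller.
int PixelAt(const std::vector<int> &row, std::size_t x)
{
    // A broken contract between API and client gets a throw, in every build.
    if (x >= row.size())
        throw std::out_of_range("PixelAt: x is past the end of the row");

    // An internal assumption gets a debug assert: if the check above
    // passed, the row cannot be empty.
    assert(!row.empty());

    return row[x];
}
```

The matching unit test calls it once with a valid index and once with an index one past the end, and expects the second call to throw.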

Posted by Steve Hawley | 1 Comments

Lou Francos

I was playing with the Product Box Generator and created the following bit of fun:

I love software.

Thanks, Lou, for tolerating me. 

Posted by Steve Hawley | 1 Comments

When "new" is Too Slow

I was working on some prototype code that needed to be super fast.  After getting basic functionality up and running, I found that the biggest chunk of time was getting lost in operator new.  In this particular chunk of code, I had some simulated recursion that was making lots and lots of little objects that were being pushed and popped onto a stack.  The issue was that these objects are tiny and transient.  This means that all the overhead imposed by new and delete will get charged again and again and again.

There are a number of choices available to you when you hit this.  I'll pass on one that breaks a number of conventional OO design rules, specifically is-a/has-a.  In my case, I had some objects that have two purposes in life - to hold data and to live in a stack.  If you are a fan of STL, you will probably haul out stack<T> and call it good.  In my case, STL's performance was heinous, so I chose to smudge the is-a/has-a line and make my little object into a stack element rather than using a stack element to hold it:

class StackPoint {
public:
    StackPoint(int x, int y) : m_next(NULL)
    {
        m_pt.x = x; m_pt.y = y;
    }
    StackPoint(POINT pt) : m_pt(pt), m_next(NULL) { }
    virtual ~StackPoint() { }
    int X() const { return m_pt.x; }
    int Y() const { return m_pt.y; }
    void SetX(int x) { m_pt.x = x; }
    void SetY(int y) { m_pt.y = y; }
    // the stack link lives in the element itself
    StackPoint *GetNext() const { return m_next; }
    void SetNext(StackPoint *next) { m_next = next; }
private:
    POINT m_pt;
    StackPoint *m_next;
};


class MyStack {
public:
    MyStack() : m_stack(NULL) { }
    virtual ~MyStack();
    bool IsEmpty() { return m_stack == NULL; }
    void Push(StackPoint *item) { item->SetNext(m_stack); m_stack = item; }
    StackPoint *Pop();
private:
    StackPoint *m_stack;
};

 

In these snippets, I'm leaving out some of the methods, but they're simple to write.  Pop() returns null if the stack is empty, or the top of the stack otherwise, resetting the stack pointer.  The destructor rips the entire stack, disposing each element in turn.  From these two classes, I can now create an allocator like this:

class StackPointAllocator {
public:
    StackPointAllocator() {  }
    virtual ~StackPointAllocator() { }

    StackPoint *Alloc(int x, int y);
    void Free(StackPoint *p);
private:
    MyStack m_stack;
};

In this case, the StackPointAllocator has a method called Alloc which will do one of two things.  If m_stack.IsEmpty() is true, it will call operator new.  If m_stack.IsEmpty() is false, it will pop the top item off, set x and y and return it.  In the typical case, there is an item left on the stack and so "allocation" (really recycling) is only a few assignment statements.  Note that m_stack is NOT a pointer, which means that its destructor will automatically get called when the allocator goes out of scope or is deleted.
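For completeness, here is one way the omitted methods might be filled in.  This is a sketch rather than the code from my prototype: it swaps the Windows POINT for a plain Point struct so it builds anywhere, drops the virtual destructors, and makes MyStack non-copyable, all of which are choices made for this example.

```cpp
struct Point { int x; int y; }; // portable stand-in for the Windows POINT

class StackPoint {
public:
    StackPoint(int x, int y) : m_next(nullptr) { m_pt.x = x; m_pt.y = y; }
    int X() const { return m_pt.x; }
    int Y() const { return m_pt.y; }
    void SetX(int x) { m_pt.x = x; }
    void SetY(int y) { m_pt.y = y; }
    StackPoint *GetNext() const { return m_next; }
    void SetNext(StackPoint *next) { m_next = next; }
private:
    Point m_pt;
    StackPoint *m_next;
};

class MyStack {
public:
    MyStack() : m_stack(nullptr) { }
    MyStack(const MyStack &) = delete;            // owns its elements
    MyStack &operator=(const MyStack &) = delete;
    ~MyStack()
    {
        // Rip the entire stack, disposing each element in turn.
        while (StackPoint *item = Pop())
            delete item;
    }
    bool IsEmpty() const { return m_stack == nullptr; }
    void Push(StackPoint *item) { item->SetNext(m_stack); m_stack = item; }
    StackPoint *Pop()
    {
        StackPoint *top = m_stack;
        if (top) {
            m_stack = top->GetNext();
            top->SetNext(nullptr);
        }
        return top; // nullptr if the stack was empty
    }
private:
    StackPoint *m_stack;
};

class StackPointAllocator {
public:
    StackPoint *Alloc(int x, int y)
    {
        StackPoint *p = m_stack.Pop();
        if (!p)
            return new StackPoint(x, y); // free list empty: really allocate
        p->SetX(x);                      // otherwise recycle
        p->SetY(y);
        return p;
    }
    // Hand the object back to the free list instead of deleting it.
    void Free(StackPoint *p) { if (p) m_stack.Push(p); }
private:
    MyStack m_stack; // free list; its destructor deletes whatever is left
};
```

One caveat with this shape: only objects that have been handed back through Free are cleaned up by the allocator, so anything still checked out when it goes away has to be freed or deleted by its holder.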

To give you a sense of the power of doing this, the original problem with allocation was costly because I was allocating tens of millions of these objects, each of which has a very short life span.  Even though the individual cost was small, multiplying by 10M added up.  I'd rather do 40M assignments.  Once the code was refactored (easy), the time for allocation totally vanished from the performance profile.

Now, the real C++ programmer question is "Why didn't I overload operator new for StackPoint?"

Overloading operator new for the class means that all code that allocates these objects will go through that one shared allocator - that's a bigger deal.  I use this code in other places that shouldn't use this allocator.  I end up making the allocators I use stack objects, which means that objects in the free list disappear when the allocator goes out of scope.

Classic engineering dilemma: which is more important? performance or OO design?  In this case, the API is NOT public, so the dirty little secret is hidden away, making this a perfectly acceptable choice.  Performance, FTW.  In addition, there were two other considerations - readability and expedience.  Readability was not lost at all, and since I could knock this out inside a half hour, I won on expedience too. 

Posted by Steve Hawley | 2 Comments

Out, Out, Damn Ref

What is the difference between the out and ref keywords in C#?  Rick and I were discussing this earlier today and I decided that I wanted to find out for real.  Inherently, I know what the difference is: ref is passed by reference and out is passed by reference with the added requirement that the parameter must be assigned to before the method returns.

But this is a very minor semantic difference, so the real question is whether or not the difference is enforced by the language/compiler or by the CLR.  For this, we will determine the answer empirically with this chunk of code:

private void SetMyFriend(out int x)
{
    x = 5;
}

private void TouchMyFriend(ref int x)
{
    x = 3;
}

private void MyFriend()
{
    int x;
    SetMyFriend(out x);
    TouchMyFriend(ref x);
}

 Once this is written and compiled, we can pull it up with ildasm and have a look at what the compiler has done.

Here is the IL for SetMyFriend()

.method private hidebysig instance void  SetMyFriend([out] int32& x) cil managed
{
  // Code size       5 (0x5)
  .maxstack  8
  IL_0000:  nop
  IL_0001:  ldarg.1  // put address of x on the stack
  IL_0002:  ldc.i4.5 // put a 5 on the stack
  IL_0003:  stind.i4 // store the 5 into the address
  IL_0004:  ret
} // end of method BasicTesting::SetMyFriend

No surprises - x is passed in as an int32& or a reference to a four byte int - ah, but it has the magic [out] attribute on it as well.

Here is TouchMyFriend():

.method private hidebysig instance void  TouchMyFriend(int32& x) cil managed
{
  // Code size       5 (0x5)
  .maxstack  8
  IL_0000:  nop
  IL_0001:  ldarg.1
  IL_0002:  ldc.i4.3
  IL_0003:  stind.i4
  IL_0004:  ret
} // end of method BasicTesting::TouchMyFriend

You'll notice that except for the [out] in the declaration, it is absolutely identical to SetMyFriend.  I won't insert the calling code, but it is precisely the same for both.  So the answer to the question is that the compiler enforces it to the degree that it will attempt to determine if the value has been assigned via static analysis.  But is that the whole story, given that the parameter is marked as [out]?  Does the CLR do any verification of its own?  The answer is no - if you write the following managed C++ method:

static void DontTouchMe([System::Runtime::InteropServices::Out]int __gc &x)
{
}

the C++ compiler lets it go by and at runtime, the CLR lets it work too.  This is moderately distressing, but not surprising.  Lesson: out is enforced by the C# compiler, not by the CLR.
Posted by Steve Hawley | 2 Comments