Welcome to Atalasoft Community Sign in | Help

Seek and Ye Shall Find

I was building on some benchmarks on some heavy stream handling code that I wrote.  It’s part of a simple state machine lexer that tokenizes input.  One bottleneck I found was surprising – read on.

The lexer uses one of my favorite patterns: peek.  This uses the ability to look ahead precisely one byte in a stream to help determine the end of a token.  I had an obvious implementation of peek in my scanner:

private int LLPeekChar()
{
    int i = _stream.ReadByte();
    if (i >=0)
        _stream.Seek(-1, SeekOrigin.Current);
    return i;
}

This is straight forward, but I had quite a surprise when I ran a substantial chunk of text through my lexer:

Routine Name    Time    Time with Children    Shared Time Hit Count
LLPeekChar 1.98 1.98    100.00    473683

Now, most people would think that 2 seconds for 473,683 iterations is not bad.  That’s .004 milliseconds per call.  I wasn’t happy, though – this is not a heavy operation and I knew when I wrote it that I would be unhappy – I already knew that Seek is an expensive operation.  So much so that this writing this line of code in any FileStream.Seek-heavy operation will make a huge difference:

if (postion != _stm.Position)
    _stm.Seek(position, SeekOrigin.Begin);

I’m not making this up – seeking to the same position costs more than catching that case.  I recommend that if you hate two argument seek as much as I do that when you write an extension method to do the most common case, that you write it this way:

public static long Seek(this Stream stm, long position)
{
    if (position == stm.Position)
        return position;
    return stm.Seek(position, SeekOrigin.Begin);
}

I decided to rework this by writing an adapter class on Stream.  This class would adapt a Stream but add the ability to un-get a single byte.  This is not unlike the C Standard IO call ungetc().

What I do is build a Stream subclass that is constructed with a Stream.  This will be a read-only Stream, to keep it simple.

Since most of the code is brief, I’ll put in the entire implementation.  The code uses a single byte buffer (this is easily extended, but I really don’t need or want the overhead of a full stack of byte – exercise left to the reader, and all that).  I also use a magic number to tell if the buffer byte is valid or not.  You could use a bool flag, but again – simple.  The only thing baroque in this implementation is that I keep track of how much to alter the stream Position property.  I could test the buffer and alter it that way, this to me makes the code less error prone, especially if it grows in the future.

using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
namespace Atalasoft.StreamTesting
{
    internal class UngetStream : Stream
    {
        const int kInvalidByte = -2;
        Stream _stm;
        long _streamOffset = 0;
        int _b = kInvalidByte;
        public UngetStream(Stream stm)
        {
            _stm = stm;
        }
        public override bool CanRead
        {
            get { return true; }
        }
        public override bool CanWrite
        {
            get { return false; }
        }
        public override int ReadByte()
        {
            if (_b >= 0)
            {
                int val = _b;
                _b = kInvalidByte;
                _streamOffset = 0;
                return val;
            }
            return _stm.ReadByte();
        }
        public override int Read(byte[] buffer, int offset, int count)
        {
            int nRead = 0;
            if (_b >= 0)
            {
                buffer[offset++] = (byte)_b;
                count--;
                _b = kInvalidByte;
                _streamOffset = 0;
                nRead++;
            }
            if (count > 0)
            {
                return nRead + _stm.Read(buffer, offset, count);
            }
            return nRead;
        }
        public override long Seek(long offset, SeekOrigin origin)
        {
            _b = kInvalidByte;
            _streamOffset = 0;
            return _stm.Seek(offset, origin);
        }
        public override long Position
        {
            get
            {
                return _streamOffset + _stm.Position;
            }
            set
            {
                Seek(value, SeekOrigin.Begin);
            }
        }
        public override long Length
        {
            get { return _stm.Length; }
        }
        public void Unget(byte b)
        {
            if (_b >= 0)
                throw new IOException("attempt to unget twice");
            _b = b;
            _streamOffset = -1;
        }
        public override void Write(byte[] buffer, int offset, int count)
        {
            throw new NotImplementedException();
        }
        public override void SetLength(long value)
        {
            throw new NotImplementedException();
        }
        public override void Flush()
        {
            throw new NotImplementedException();
        }
        public override bool CanSeek
        {
            get { return true; }
        }
    }
}

This is just a two-state entity: ungotten byte pending or not pending.  So my implementation of LLPeekChar changes to this:

private int LLPeekChar()
{
    int i = _stream.ReadByte();
    if (i >= 0)
        _stream.Unget((byte)i);
    return i;
}

Now when I measure the performance, I get the following result:

Routine Name    Time    Time with Children    Shared Time Hit Count
LLPeekChar 0.01 0.03    43.92    473683

The performance is now 66x faster if you include the performance of Unget.  More importantly, a 2 second hit in the total time running this app vanishes into the realm of imperceptible, as it should.  The total time per iteration is now 6.3 x 10-5 milliseconds or 63 nanoseconds.  I can live with that.  Can you?

As a final note, I did not preemptively optimize this code.  I did a full day of tests resulting in me doing some heavy refactoring – I had a routine that was taking 14 seconds of my total runtime which is now also taking 0.03 seconds.  When I was done with everything else significant within my control, LLPeekChar was the highest time thief and this type of optimization is clean and simple.

Published Tuesday, April 20, 2010 4:08 PM by Steve Hawley

Comments

No Comments
Anonymous comments are disabled