Seek and Ye Shall Find
I was running some benchmarks on some heavy stream-handling code that I wrote. It’s part of a simple state-machine lexer that tokenizes input. One bottleneck I found was surprising – read on.
The lexer uses one of my favorite patterns: peek. This uses the ability to look ahead precisely one byte in a stream to help determine the end of a token. I had an obvious implementation of peek in my scanner:
private int LLPeekChar()
{
    int i = _stream.ReadByte();
    if (i >= 0)
        _stream.Seek(-1, SeekOrigin.Current);
    return i;
}
This is straightforward, but I had quite a surprise when I ran a substantial chunk of text through my lexer:
Routine Name | Time (s) | Time with Children (s) | Shared Time (%) | Hit Count
LLPeekChar   | 1.98     | 1.98                   | 100.00          | 473683
Now, most people would think that 2 seconds for 473,683 iterations is not bad. That’s .004 milliseconds per call. I wasn’t happy, though – peek is not a heavy operation, and I knew when I wrote it that I would be unhappy with it, because Seek is expensive. So expensive that adding this guard to any FileStream.Seek-heavy code will make a huge difference:
if (position != _stm.Position)
    _stm.Seek(position, SeekOrigin.Begin);
I’m not making this up – seeking to the position you’re already at costs more than checking for that case first. If you hate the two-argument Seek as much as I do and write an extension method for the most common case, I recommend you write it this way:
public static class StreamExtensions
{
    public static long Seek(this Stream stm, long position)
    {
        // Seeking to the position you're already at is surprisingly
        // expensive on FileStream, so skip the call when nothing would move.
        if (position == stm.Position)
            return position;
        return stm.Seek(position, SeekOrigin.Begin);
    }
}
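With the extension in scope, call sites collapse to the single-argument form and pick up the guard for free. A quick, purely illustrative before-and-after – the stm and tokenStart names are made up for this example:

// Before: two-argument form, seeks even when the stream is already there.
stm.Seek(tokenStart, SeekOrigin.Begin);

// After: single-argument extension, skips the underlying Seek when nothing would move.
stm.Seek(tokenStart);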
I decided to rework this by writing an adapter class on Stream. The class wraps a Stream but adds the ability to un-get a single byte, not unlike the C standard I/O function ungetc().
What I do is build a Stream subclass that is constructed with another Stream. To keep it simple, it will be read-only.
Since the code is brief, I’ll put in the entire implementation. The code uses a single-byte buffer (this is easily extended, but I really don’t need or want the overhead of a full Stack&lt;byte&gt; – exercise left to the reader, and all that). I also use a magic number to tell whether the buffered byte is valid. You could use a bool flag, but again – simple. The only thing baroque in this implementation is that I keep track of how much to adjust the stream’s Position property. I could test the buffer and compute the adjustment that way, but a separate offset seems less error prone to me, especially if the code grows in the future.
using System;
using System.IO;

namespace Atalasoft.StreamTesting
{
    internal class UngetStream : Stream
    {
        // Marker for "no byte buffered" - distinct from -1, which ReadByte
        // uses to signal end of stream.
        const int kInvalidByte = -2;

        Stream _stm;
        long _streamOffset = 0;  // correction applied to Position while a byte is buffered
        int _b = kInvalidByte;   // the single ungotten byte, or kInvalidByte

        public UngetStream(Stream stm)
        {
            _stm = stm;
        }

        public override bool CanRead
        {
            get { return true; }
        }

        public override bool CanWrite
        {
            get { return false; }
        }

        public override int ReadByte()
        {
            // Hand back the buffered byte first, if there is one.
            if (_b >= 0)
            {
                int val = _b;
                _b = kInvalidByte;
                _streamOffset = 0;
                return val;
            }
            return _stm.ReadByte();
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            int nRead = 0;
            // Drain the buffered byte before touching the underlying stream.
            if (_b >= 0)
            {
                buffer[offset++] = (byte)_b;
                count--;
                _b = kInvalidByte;
                _streamOffset = 0;
                nRead++;
            }
            if (count > 0)
            {
                return nRead + _stm.Read(buffer, offset, count);
            }
            return nRead;
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            // Any explicit seek invalidates the buffered byte.
            _b = kInvalidByte;
            _streamOffset = 0;
            return _stm.Seek(offset, origin);
        }

        public override long Position
        {
            get
            {
                // While a byte is buffered, we are logically one byte behind
                // the underlying stream.
                return _streamOffset + _stm.Position;
            }
            set
            {
                Seek(value, SeekOrigin.Begin);
            }
        }

        public override long Length
        {
            get { return _stm.Length; }
        }

        public void Unget(byte b)
        {
            if (_b >= 0)
                throw new IOException("attempt to unget twice");
            _b = b;
            _streamOffset = -1;
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            // Read-only wrapper.
            throw new NotSupportedException();
        }

        public override void SetLength(long value)
        {
            // Read-only wrapper.
            throw new NotSupportedException();
        }

        public override void Flush()
        {
            // Nothing to flush - this wrapper never writes.
        }

        public override bool CanSeek
        {
            get { return true; }
        }
    }
}
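Here’s a minimal sketch of the adapter in use, just to show the semantics. The file name is a placeholder, and this assumes the UngetStream class above is in scope:

// Sketch only - "input.txt" is a placeholder.
using (Stream raw = File.OpenRead("input.txt"))
{
    UngetStream stm = new UngetStream(raw);

    int b = stm.ReadByte();          // consume one byte
    if (b >= 0)
        stm.Unget((byte)b);          // push it back; Position drops by one

    int again = stm.ReadByte();      // same byte again, no Seek involved
    Console.WriteLine(b == again);   // True for any non-empty input
}

Note that Position tracks the logical position – it steps back by one while a byte is ungotten – so callers that mix Unget with Position checks still see what they expect.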
This is just a two-state entity: ungotten byte pending or not pending. So my implementation of LLPeekChar changes to this:
private int LLPeekChar()
{
    int i = _stream.ReadByte();
    if (i >= 0)
        _stream.Unget((byte)i);
    return i;
}
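The only other change is that the scanner’s _stream field has to be an UngetStream (or otherwise expose Unget) so that call resolves. A sketch of what that wiring might look like – the class and parameter names here are assumptions, not the actual lexer source:

// Assumed wiring - names are illustrative, not the actual lexer.
internal class Lexer
{
    // _stream is now an UngetStream rather than the raw input stream,
    // so LLPeekChar can push a byte back instead of seeking.
    private readonly UngetStream _stream;

    public Lexer(Stream input)
    {
        _stream = new UngetStream(input);
    }
}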
Now when I measure the performance, I get the following result:
Routine Name | Time (s) | Time with Children (s) | Shared Time (%) | Hit Count
LLPeekChar   | 0.01     | 0.03                   | 43.92           | 473683
The performance is now 66x faster, even including the cost of Unget (the Time with Children column). More importantly, a 2-second hit in the app’s total running time vanishes into the realm of the imperceptible, as it should. The total time per iteration is now 6.3 × 10⁻⁵ milliseconds, or 63 nanoseconds. I can live with that. Can you?
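The numbers above came from a profiler, but if you want to see the same shape of result without one, a rough Stopwatch harness like the one below will do it. Everything in it is illustrative – the file name, the iteration count, and the two peek helpers are stand-ins, not my lexer:

// Rough, illustrative comparison harness - not the profiler run above.
// Assumes the UngetStream class from this post is in scope.
using System;
using System.Diagnostics;
using System.IO;

static class PeekComparison
{
    static int PeekWithSeek(Stream stm)
    {
        int i = stm.ReadByte();
        if (i >= 0)
            stm.Seek(-1, SeekOrigin.Current);  // the expensive version
        return i;
    }

    static int PeekWithUnget(UngetStream stm)
    {
        int i = stm.ReadByte();
        if (i >= 0)
            stm.Unget((byte)i);                // the cheap version
        return i;
    }

    static void Main()
    {
        const int iterations = 473683;         // the hit count from the profile above
        using (FileStream fs = File.OpenRead("input.txt"))  // placeholder file
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int n = 0; n < iterations; n++)
                PeekWithSeek(fs);
            Console.WriteLine("Seek-based peek:  {0} ms", sw.ElapsedMilliseconds);

            fs.Position = 0;
            UngetStream stm = new UngetStream(fs);
            sw = Stopwatch.StartNew();
            for (int n = 0; n < iterations; n++)
                PeekWithUnget(stm);
            Console.WriteLine("Unget-based peek: {0} ms", sw.ElapsedMilliseconds);
        }
    }
}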
As a final note, I did not preemptively optimize this code. I spent a full day testing, which led to some heavy refactoring – a routine that had been taking 14 seconds of my total runtime is now also down to 0.03 seconds. Once I had dealt with everything else significant within my control, LLPeekChar was the biggest remaining time thief, and this type of optimization is clean and simple.