Tag: pros

What Makes Good Code? – Should Every Class Have An Interface? Pt 2

Should Every Class Have an Interface?

This is part two in the sub-series of “Should Every Class Have an Interface?“, and part of the bigger “What Makes Good Code?” series.

Other Peoples’ Code

So in the last post, we made sure we could get an interface for every class we made. Okay, well that’s all fine and dandy (I say half sarcastically). But you and I are smart programmers, so we like to re-use other peoples’ code in our own projects. But wait just a second! It looks like Joe Shmoe didn’t use interfaces in his API that he created! We refuse to pollute our beautiful interface-rich code with his! What can we do about it?

Wrap it.

That’s right! If we add a little bit of code we can get all the benefits as the example we walked through originally. It’s not going to completely fix “the problem”, but I’ll touch on that after. So, we all remember our good friend encapsulation, right?

Let’s pretend that Joe Shmoe wrote some cool code that does string lookups from an Excel file. We want to use it in our code, but Joe didn’t use the IStringLookup interface (because… it’s in OUR code, not his) and he didn’t even use ANY interfaces. The constructor for his class looks like:


public ExcelParser(string pathToExcelFile);

On this class, there’s two methods. One method allows us to find the column index for a certain heading, and the other method allows us to get a cell’s value given a column and row index. The method calls looks like:


public int GetColumnIndex(string columnName);

public string GetCellValue(int columnIndex, int rowIndex);

We can wrap that class by creating a wrapper class that meets our interface, like so:


public sealed class ExcelStringLookup
{
  // ugh... we have to reference the class directly!
  private readonly ExcelParser _excelParser;

  // ugh... we have to reference the class directly!
  public ExcelStringLookup(ExcelParser excelParser)
  {
    _excelParser = excelParser;
  }

  public string GetString(string name)
  {
    var columnIndex = _excelParser.GetColumnIndex(name);
    // assumes all of our strings will be under a column header
    var cellValue = _excelParser.GetCellValue(columnIndex, 1);
    return cellValue;
  }
}

And now this will plug right into the rest of our code that we defined originally.

This doesn’t totally eliminate “the problem” though (the problem being that some class doesn’t have an interface (what this post is trying to answer)). There’s still a class we’re making use of that doesn’t have an interface, but it looks like we’ve reduced the exposure of that problem to JUST this class and the spot that would construct this class. Are we okay with that?

Thoughts So Far…

Let’s do a little recap on what we’ve seen so far:

  • Having interfaces for our classes is a nice way to introduce a layer of abstraction
  • Interfaces are just *one* tool to get layers of abstraction introduced
  • If you wanted to have interfaces for all of the classes in your code and some third party didn’t use interfaces, that code is likely not as common in your code base (especially if you wrap it like I mentioned above). This may not always be true in your code base, but it’s likely the case.
  • The amount of work to wrap things can vary greatly. Some things are straight forward to wrap, but you need to add many methods/properties. Sometimes it’s the inverse and you only have a few things to wrap but they’re not straight forward.
  • The number of classes you’d need to wrap to get to this state can vary greatly… Since even built-in System classes aren’t all backed with interfaces!
  • There’s certainly a trade off between the original work + maintenance to wrap a class in an interface versus the benefits it provides.

Is that last point blasphemy?! So there may actually be times we DON’T want to have an interface for a class?

Watch this space for part 3 where we start to look at a counter-example!

 


API: Top-Down? Bottom-Up? Somewhere in the Middle?

A Quick Brain-Dump on API Desgin

I’ll keep this one pretty brief as I haven’t totally nailed down my thoughts on this. I still thought it was worth a quick little post:

When you’re creating a brand new API to expose some functionality of a system, should you design it with a strong focus on how the internals work? Should you ignore how internals work and make it as easy to consume as possible? Or is there an obvious balance?

I find myself trying to answer this question without ever explicitly asking it. Any time I’m looking to extend or connect systems, this is likely to come up.

Most Recently…

Most recently I started trying to look at creating an API over AMQP to connect my game back-end to a Unity 3D front-end. I had been developing the back-end for a little while now, so I had a pretty good idea for how things needed to work from that perspective. The front-end? Not so much really. I knew some basic actions that I was considering, so I tried coding up an early API for them.

A lot of my focus was around how I was going to implement the code on the back-end to make this API work. It resulted in some of the API calls looking a little bit gross. But the idea was that if I settled on an API that would make the back-end easy to implement, I could get it up and running faster.

After a little while, I feel like my API isn’t getting any cleaner from a consumer perspective, and funny enough, it’s not actually as easy to implement as I was hoping on the back-end. Which had me reflect on a work example…

Once Upon a Time at Work…

Well, it was only a couple of months ago, really. I was working with a colleague on integrating a new system into an existing code base. We decided we wanted to approach the API problem from a consumer perspective. We said “let’s make this as easy to call and use as possible so that people ACTUALLY want to use it”.

We set out with this mission, and created a pretty simplistic API. The challenge? There was a lot of heavy lifting and a bit of voodoo going on behind the scenes. But you know what? We hid the magic in one spot of the code (instead of having ugly stuff scattered all over the code base) and it ended up being a very usable API.

So…

So does this consumer-first, top-down approach to API design always work? I’m not sure. Some similarities/differences in the scenarios:

  • In my current situation, I have a back-end implemented and very minimal code implemented on the caller side. In the work example, we had nothing implemented on the back-end, and a ton of code implemented where the caller side would be.
  • In both examples, at least one of the caller side or back-end side was reasonably well understood. For my game, the back-end was pretty well understood. For work, the caller side was pretty well understood and we had experience with what we’d call a “failed’ back-end implementation (that we were actually setting out to redesign).
  • The work example was a relatively small subset for an API, but the game example was about to be a very specific implementation that I’d need to adapt into a pattern for all messaging in my AMQP system

So there’s a few things to consider there. I think I’m at the point in my game where I’d like to revisit how I’m forming this API and try it from a client-first perspective. Now that I know some of the catches, maybe I’ll shed some new light!

How do you approach API design?


ProjectXyz: Enforcing Interfaces (Part 2)

Enforcing Interfaces

This is my second installment of the series related to my small side project that I started. I mentioned in the first post that one of the things I wanted to try out with this project is coding by interfaces. There’s an article over at CodeProject that I once read (I’m struggling to dig it up now, arrrrrghh) that really gave me a different perspective about using interfaces when I program. Ever since then I’ve been a changed man. Seriously.

The main message behind the article was along the lines of: Have your classes implement your interface, and to be certain nobody is going to come by and muck around with your class’s API, make sure they can’t knowingly make an instance of the class. One of the easiest ways to do this (and bear with me here, I’m not claiming this is right or wrong) is to have a hidden (private or protected) constructor on your class and static methods that let you create new instances of your class. However, the trick here is your static method will return the interface your class implements.

An example of this might look like the following:


public interface IMyInterface
{
    void Xyz();
}

public sealed class MyImplementation : IMyInterface
{
    // we hid the constructor from the outside!
    private MyImplementation()
    {
    }

    public static IMyInterface Create()
    {
        return new MyImplementation();
    }

    public void Xyz()
    {
        // do some awesome things here
    }
}

Interesting Benefits

I was pretty intrigued by this article on enforcing interfaces for a few reasons and if you can stick around long enough to read this whole post, I’ll hit the cons/considerations I’ve encountered from actually implementing things this way. These are obviously my opinion, and you can feel free to agree or disagree with me as much as you like.

  • (In theory) it keeps people from coming along and tacking on random methods to my classes. If I have an object hierarchy that I’m creating, having different child classes magically have random public APIs changing independently seems kind of scary. People have a harder time finding ways to abuse this because they aren’t concerned with the concrete implementation, just the interface.
  • Along with the first point, enforcing interfaces makes people think about what they’re doing when they need to change the public API. Now you need to go change the interface. Now you might be affecting X number of implementations. Are you sure?
  • Sets people up nicely to play with IoC and dependency injection. You’re already always working with interfaces because of this, now rolling out something like Moq or Autofac should be easier for you.
  • Methods can be leveraged to do parameter checks BEFORE entering a constructor. Creating IDisposable implementations can be fun when your constructor fails but and your disposable clean up code was expecting things to be initialized (not a terribly strong argument, but I’ve had cases where this makes life easier for me when working with streams).

Enforcing Interfaces in ProjectXyz

I’ve only implemented a small portion of the back-end of ProjectXyz (from what I imagine the scope to be) but it’s enough where I have a couple of layers, some different class/interface hierarchies that interact with each other, and some tests to exercise the API. The following should help explain the current major hierarchies a bit better:

  • Stats are simple structures representing an ID and a value
  • Enchantments are simple structures representing some information about modifying particular stats (slightly more complex than stats)
  • Items are more complex structures that can contain enchantments
  •  Actors are complex structures that:
    • Have collections of stats
    • Have collections of enchantments
    • Have collections of items

Okay, so that’s the high level. There’s obviously a bit more going on with the multi-layered architecture I’m trying out here too (since the hierarchies are repeated in a similar fashion in both layers). However, this is a small but reasonable amount of code to be trying this pattern on.

I have a good handful of classes and associated interfaces that back them. I’ve designed my classes to take in references to interfaces (which, are of course backed by my other classes) and my classes are largely decoupled from implementations of other classes.

Now that I’ve had some time to play with this pattern, what are my initial thoughts? Well, it’s not pure sunshine and rainbows (which I expected) but there are definitely some cool pros I hadn’t considered and definitely some negative side effects that I hadn’t considered either. Stay tuned

(The previous post in this series is here).


Continuous Improvement – One on One Tweaks

Continuous Improvement - One on One Tweaks

Continuous Improvement – Baby Steps!

Our development team at Magnet Forensics focuses a lot on continuous improvement. It’s one of the things baked into a retrospective often performed in agile software shops. It’s all about acknowledging that no system or process is going to be perfect and that as your landscape changes, a lot of other things will too.

The concept of continuous improvement isn’t limited to just the software we make or the processes we put in place for doing so. You can apply it to anything that’s repeated over time where you can measure positive and negative changes. I figured it was time to apply it to my leadership practices.

The One on One

I lead a team of software developers at Magnet, but I’m not the boss of any of them. They’re all equally my peers and we’re all working toward a common goal. One of my responsibilities is to meet with my team regularly to touch base with them. What are things they’ve been working on? What concerns do they have with the current state of things? What’s going well for them? What sort of goals are they setting?

The one on ones that we have setup are just another version of continuous improvement. It’s up to me to help empower the team to drive that continuous improvement, so I need to facilitate them wherever I can. Often this isn’t a case of “okay, I’ll do that for you” but a “yes, I encourage you to proceed with that” type of scenario. The next time we meet up, I check in to see if they were able to make headway with the goals they had set up and we try to change things up if they’ve hit roadblocks.

No Change, No Improvement

I had been taking the same approach to one-on-ones for a while. I decided it was time for a change. If it didn’t work, it’s okay… I could always try something else. I had a good baseline to measure from, so I felt comfortable trying something different.

One on ones often consisted of my team members handing me a sheet of past actions, concerns, and status of goals before we’d jump into a quick 20 minute meeting together. I’d go over the sheet with them and we’d add in any missing areas and solidify goals for next time. But I wanted a change here. How helpful can I be if I get this sheet as we go into the room together?

I started asking to get these sheets ahead of time and started paraphrasing the whole sheet into a few bullet points. A small and simple change. But what impact did this have?

Most one on ones went from maxing out 20 minutes to only taking around 10 minutes to cover the most important topics. Additionally, it felt like we could really deep dive on topics because I was prepared with some sort of background questions or information to help progress through roadblocks. Myself and my team member could blast through the important pieces of information and then at the end, if I’d check to make sure there’s nothing we’d missed going over. If I had accidentally omitted something, we’d have almost another 10 minutes to at least start discussing it.

Trade Off?

I have an engineering background, so for me it’s all about pros and cons. What was the trade-off for doing this?

The first thing is that initially it seemed like I was asking for the sheets super early. Maybe it still feels like I’m asking for them early. I try to get them by the weekend before the week where I start scheduling one on ones, so sometimes it feels like people had less than a month to fill them out. Is it a problem really? Maybe not. Maybe it just means there’s less stuff to try and cram into there. I think the benefit of being able to go into the meeting with more information on my end can make it more productive.

The second thing is that since I paraphrase the sheet, I might miss something that my team member wanted to go over. However, because the time is used so much more effectively, we’re often able to cover anything  that was missed with time to spare. I think there’s enough trust in the team for them to know that if I miss something that it’s not because I wanted to dodge a question or topic.

I think the positive changes this brought about have certainly outweighed the drawbacks. I think I’ll make this a permanent part of my one on one setup… Until continuous improvement suggests I should try something new!


Yield! Reconsidering APIs with Collections

Yield! Reconsidering APIs with Collections (Image by http://www.sxc.hu/)

Yield: A Little Background

The yield keyword in C# is pretty cool. Being used within an iterator, yield lets a function return an item as well as control of execution to the caller and upon next iteration resume where it left off. Neat, right? MSDN documentation lists these limitations surrounding the use of the yield keyword:

  • Unsafe blocks are not allowed.
  • Parameters to the method, operator, or accessor cannot be ref or out.
  • A yield return statement cannot be located anywhere inside a try-catch block. It can be located in a try block if the try block is followed by a finally block.
  • A yield break statement may be located in a try block or a catch block but not a finally block.

So what does this have to do with API specifications?

A whole lot really, especially if you’re dealing with collections. I personally haven’t been a big user of the yield keyword, but I’ve never really been forced to use it. After playing around with it for a bit, I saw a lot of potential. I’ve written before about what I think makes a good API. In my article, I was making a point to discuss two perspectives:

  • Who needs to implement your interface. You want it to be easy for them to implement.
  • Who needs to call your interface. You want it to be easy for them to use.

In my opinion, the IEnumerable<T> interface was a tricky thing to work with as a return value. You can essentially only iterate an IEnumerable, and at the time of calling a function, maybe that’s not what you want to do. The flip side is that for the person implementing the interface, IEnumerable<T> is a really easy interface to satisfy. However, the yield keyword has opened up some new doors.

In this article, I’d like to go over a couple of different approaches for an API and then explain why the yield keyword might be something you consider next time around. Disclaimer: I’m not claiming anything I’m about to present is the only way or the best way–I’m just sharing some of my own findings and perspective.

Interface For Returning Collections

The first type of API I’d like to look at is for returning collections. Based on my own API guidelines, I’d ideally choose an interface or class to return that provides a lot of information to the caller that is also easy to create for the implementer of my interface. The List<T> class is a great choice:

  • It’s easy to construct
  • It’s built-in to the .NET framework
  • It provides many handy functions (All of the IList<T> functionality as well as things like AddRange(), or functions that support delegates)

My next choice might be to have a return type of IList<T>, which would provide a little less ease of use to the caller, but make it even easier for the implementer of the interface. They could return arrays of type T, since an array implements the IList<T> interface, or their own custom list implementation that doesn’t inherit from the List<T> class. The differences between using IList<T> and List<T> are arguable pretty small.

A third alternative, which I would have avoided in the past, is to return an IEnumerable<T>. My opinion used to be that this made the life of the interface implementer a bit easier compared to returning an IList<T>, but complicated the life of the caller for a couple of reasons:

  • The caller would have to use the results of the function in a foreach loop.
  • The caller would have to add the items to their own collection to be able to do much more with the items.

My naive implementations of being forced to return an IEnumerable<T> were… well… crap. I would have constructed a collection within the function, fill it up, and then return it as an IEnumerable<T>. Then as the caller of my function, I’d have to re-enumerate the results (or add it to another collection):

public static IEnumerable<T> GetItems()
{
  var collection = new List<T>();
  // add all the items to a collection
  return collection;
}

private static void Main()
{
  var myCollection = new List<T>();
  myCollection.AddRange(GetItems());
  // use myCollection...

  // or.....
  foreach (var item in GetItems())
  {
    // use the items
  }
}

Seems like overkill to me with that implementation. However, we’ll examine how using yield can truly transform this into something… better. So to reiterate, a few potential implementations for an API involving collections might be:

  • Return a List<T> class
  • Return an IList<T> (or even an ICollection<T>) interface
  • Return an IEnumerable<T> interface

Constantly Creating Collections

My design decisions, in the past, were really driven by two guidelines:

  • Make it easier for the person implementing/extending the API
  • Make it easy for the person consuming the API

As I quickly illustrated in the first section, this meant that I would have a method where I would create a collection, fill it with items, and then return it. I could generally pick any concrete collection class and return it since I would usually pick a simple collection as the return type. Easy.

One thing that might be noticeable with this approach is that it looks pretty inefficient to keep creating new collections, fill them, and then return them. I’ll illustrate with a simple example. We’ll create a class that has a method on it called GetItems(). As per my reasoning presented earlier, we’ll have this method return a List<T> instance, and to make this example easier to work with, we’ll pass in an IEnumerable<T> instance. For what it’s worth, the input to this function is really just for demonstration purposes here–We’re really focusing on how we’re creating our return value.

public class CreateNewListApi<T>
{
  public List<T> GetItems(IEnumerable<T> input)
  {
    var newCollection = new List<T>();

    foreach (var item in input)
    {
      newCollection.Add(item);
    }

    return newCollection;
  }
}

And now that we have our simple class we can mock up a little test for performance… Just how inefficient is creating new lists every time?

internal class Program
{
  private static void Main(string[] args)
  {
    const int NUM_ITEMS = 100000000;
    var inputItems = new int[NUM_ITEMS];

    Console.WriteLine("API Creating New Collections");
    var api = new CreateNewListApi<int>();

    var watch = Stopwatch.StartNew();
    var results = api.GetItems(inputItems);

    foreach (var item in results)
    {
    }

    Console.WriteLine(watch.Elapsed);
    Console.WriteLine(Process.GetCurrentProcess().PrivateMemorySize64);
    Console.ReadLine();
  }
}

When I run this on my machine, I get an average of about 1.73 seconds. The memory printout I get when running is 1615908864 bytes. So is that slow? Is that a lot of memory usage? I think it’s pretty hard to say conclusively without being able to compare it against anything. So let’s keep this number in mind as we continue to investigate the alternatives.

Side Note: At this point, some readers may be saying “Well, if the input to our function was also a list (or if whatever our function has to work with was otherwise equivalent to our return value) then we wouldn’t have to go populate a new collection every time… We can just return the underlying collection”! And I would say you are absolutely correct. If your function has access to an instance of the same type as the return type, then you could always just return that instance. But what implications does this have? You’re now giving people access to your underlying internals, and they can go modify them as they please. So, if you need to control access to items being added or removed, then it might not make sense for you to expose your internal collections like this.

Yield to Incoming API Alternatives

We’ve seen how my past implementations may have looked, so how might we tweak this? If we tweak our API a bit, we can make our method return an IEnumerable<T> instead. Let’s see what that might look like:

public class YieldingApi<T>
{
  public IEnumerable<T> GetItems(IEnumerable<T> input)
  {
    foreach (var item in input)
    {
      yield return item;
    }
  }
}

So in this API implementation, all we’ll be doing is iterating over some type of collection and then yielding each result. If we run it through the same type of test as out previous API implementation, what kind of results do we end up with?

internal class Program
{
  private static void Main(string[] args)
  {
    const int NUM_ITEMS = 100000000;
    var inputItems = new int[NUM_ITEMS];

    Console.WriteLine("API Yielding");
    var api = new YieldingApi<int>();

    var watch = Stopwatch.StartNew();
    var results = api.GetItems(inputItems);

    foreach (var item in results)
    {
    }

    Console.WriteLine(watch.Elapsed);
    Console.WriteLine(Process.GetCurrentProcess().PrivateMemorySize64);
    Console.ReadLine();
  }
}

When I run this on my machine, I get an average of about 2.80 seconds. The memory printout I get when running is 449409024 bytes. How does this relate back to our first implementation? Well, it’s certainly slower. It takes about 1.62x as long to enumerate using the yield implementation as it did with the first API we created. However, yield also uses less than 1/3 (about 27.8%, actually) of the memory footprint when compared to the first implementation. Pretty cool results!

Site Note: So yield was a bit slower according to our results, but what happens if print the elapsed time before we run that foreach loop? Well, on my machine it averages at about one millisecond. Now that’s fast, right?! The cool thing about using yield with the IEnumerable<T> interface is that the work is deferred. That is, not until the program goes to actually run the enumeration do we get our performance hit. Try it out! Try moving the time printout from after the foreach loop to before the foreach loop. Try sticking breakpoints in on the line that yields. You’ll see what I mean.

Summary

In this article, I’ve explored two different ways of implementing an API (specifically focusing on the return value). We saw a brief performance analysis between the two and I highlighted some differences in both approaches. Let’s recap though:

  • Approach 1: Returning a List<T> and creating the collection ahead of time
    • Appeared to be overall a bit faster then yielding.
    • Consumed much more memory than yielding.
    • Callers can use the results immediately for enumeration, checking count, or as a collection to add more things to
    • The return type of List<T> is a bit more restrictive than an IEnumerable<T> like in the second API implementation
  • Approach 2: Return type of IEnumerable<T> and yielding results
    • Appeared to be overall a bit slower than the List<T> implementation
    • Lazy. We don’t actually execute any enumeration code until the caller actually enumerates
    • Consumed significantly less memory than the first approach using List<T>
    • Callers can enumerate the results immediately, but they need to add the results to a collection class to do much more than enumerate

So next time you’re designing an API for your interfaces and classes, try keeping these things in mind!

EDIT (December 30th, 2013):
As per some comments on Google+ by Dan Nemec, I figured I’d add a bit more here in the summary. IEnumerable<T> on it’s own is certainly not useless, especially if you’re leveraging LINQ or extension methods. My main beef in the past was that the consumer of an API with a IEnumerable<T> return value can only iterate over the results… And that’s just because that’s all that IEnumerable<T> lets you do. Dan made a great point though–If you are leveraging things like extension methods, or LINQ (which introduces tons of handy extension methods for working with IEnumerable<T>) then you get all of that functionality tacked on to IEnumerable<T>.

So if you’re not fortunate enough to be working with LINQ or extension methods (i.e. working with legacy code in old .NET framework versions… and yes I am familiar with the attribute you can add in to allow extension methods provided you have a compiler version high enough to support it), then IEnumerable<T> sometimes just plain sucks. I’d wager the majority of C# developers aren’t in this boat though, so I’d like to thank Dan again for his comments.


Singletons: Why Are They “Bad”?

Background

The very first thing I want to say is that I don’t think singletons are inherently bad–even if it means I am cast away from the rest of the programming world. There’s a time and a place for the singleton. It’s really common for people to get caught up with their perspective on something that they outright refuse to acknowledge the other side of it. I’d also like to point out that if you have a strong opinion on something and you find that other people also have a strongly opposing opinion on the same thing, there’s probably good take away points from either side. In this post though, I’m going to focus on why singletons are “bad”, because for me it means acknowledging one of the two main perspectives–that they are the best thing since cat videos met The Internet or they are the worst thing since Justin Bieber.

Let’s clarify what a singleton is here so we’re all on the same page. Maybe you’re under the impression it’s something slightly different than what I’m about to be talking about, so I’d rather make it clear from the beginning. And for what it’s worth, if this isn’t the exact meaning as set in stone by the singleton gods, then that’s sort of unfortunate… because this is going to be what I’m discussing.  From Design Patterns: Elements of Reusable Object-Oriented Software by the Gang of Four, a singleton must:

Ensure a class has only one instance and provide a global point of access to it.

So, with that incredibly complex definition, let’s get into it.

 

Global Dependencies

There’s approximately 3.2 billion articles on The Internet that will tell you that global variables are the enemy. I mean, I could do some of the work for you, but you could check out this, or this, or this… There’s still billions more. Usually when we have ~99% of people agreeing on something, they probably have a pretty good point, right? You’ll notice after a bit of searching that one *big* problem with singletons is the fact that they are just a different way to dress a global variable. Thus, all of the arguments for why global variables are bad could be used against singletons.

There are a few (quite a few) major problems with having global dependencies:

  • You’ll be putting yourself at risk for dealing with deadlocks if you need to lock your resources
  • Your singleton can become the resource bottleneck in your application
  • It’s hard to know who you’ll affect by modifying the global variable
  • Testing becomes difficult because the tests may depend on the state of the singleton
  • Dependencies aren’t obvious from examining the API

Quite simply, global dependencies can be pretty scary. If you and you’re team are experienced enough, trying to weave in some new features or heavily modify code relying on singletons may not be that challenging for you. Maybe. But testing can get super messy.

When you write unit tests, you want to ensure that each test can be run independently of the others. You need complete control over your state. This can be a big problem if you run two tests in a row that depend on state provided by a singleton. Consider the following set of tests that depend on a singleton in a sample run:

  • Test1 increases the count property on a singleton by 1 as part of the side effect of the test.
  • Test2 also increases the count property of the same singleton instance by 1, just as Test1 did. However, Test2 is explicitly testing this property as part of it’s validation to ensure the value transitioned from 0 to 1.

When you run Test1 all by itself, it behaves as you expect. Great! Now you move onto Test2. Simple. You got it working. You’re all excited so you run them all together. Uh oh… The tests failed! But how? You don’t believe it, so you run them all again. Now they pass! This is a crappy spot to be in. If the test suite executes Test2 before Test1, your tests will work. If the test suite executes them in the opposite order, your tests will fail because the singleton instance will have had its count property increased twice. The global state of the singleton comes back to haunt you.

 

Tight Coupling

Most experienced developers know that you want to decrease coupling and increase cohesion because this, generally speaking, makes for a good API and extensible code base. Now this point is related to the global dependency points I was previously stating, but it deserves to be addressed on it’s own. The global dependency topic was more of a focus on the application at run time. That is, it becomes difficult to know and manage how your application behaves at run time when you share global state across the entire thing. It’s challenging to to start adding new functionality to your application or modify existing components of your application that depend on the global state because the dependencies just keep growing.

Coupling, in my opinion, is more of an issue with compile time. With singletons, you start to design classes that depend on singletons and thus you are relying on a specific implementation (Maybe you could check out dependency injected singletons?). Singletons, generally speaking, are a single instance of a concrete class. You start to code classes and object hierarchies that are then depending on this concrete class, and this can potentially (and I say potentially, because I would never claim it’s “always”) lead you to a dead end. Let’s consider a scenario:

We’re coding an application that uses a singleton for the data access layer of our application. We have a MySQL data model that we’ve coded as a singleton. We expose a few methods to read and write records to the database using some SQL, and things are great. Then, one day, we speak to our customers like we usually do to ensure we’re meeting their expectations. They inform us that we really need to be able to support a document database like MongoDB in addition to MySQL. Hold on. Wait. You mean our code that uses our data layer needs to be agnostic to the database under the hood?! But… But our singleton is only able to deal with MySQL… (I wrote about how to get yourself out of this situation here, although I would not claim it’s the ideal scenario). If we would have been passing around references to something that met a nice and clean model interface spec, we’d likely be able to hide the implementation details and completely avoid this problem.

 

Singletons Disrupt The API

Because I like to design complex systems in code, API and architecture is something I like to focus on. There’s an awesome posting over here by Miško Hevery about this very problem. He does a great job describing the problem with examples, so I’ll only try to summarize some of the main take-away points I got from it.

There’s nothing to stop someone from putting some heavy initialization logic in a singleton. I mean, you shouldn’t (because you can’t guarantee when this is going to happen!) but there’s nothing that prevents it. As such, simply calling a constructor on some class (which you might assume to happen pretty quickly) actually ends up taking seconds. Not a couple milliseconds… But seconds. Oh. I guess you didn’t realize your class was trying to call a singleton instance that was connecting over The Internet to some host on the other side of the planet during its initialization. Surprise. Singletons can mask this kind of stuff because it’s not explicit in the API. It kind of goes back to coupling but I wanted to point out that this kind of stuff can get scary.

This can be completely mitigated by incorporating the dependencies right into the API. There’s no hiding crazy database initialization or downloading data from the internet in a constructor (well… it just reduces the likelihood of you doing it so easily I guess). You use interfaces and construct classes in the order that you need them, and this doesn’t end up being some magical process that happens behind a curtain. If one class depends on another, so be it (it’d be nice if it just depended on the interface…), but pass in the reference that you require. Singletons often end up being the shortcut, but what’s easier for you to code now may not be easier for someone to extend upon and understand in the future.

 

Summary

Singletons. You’ve likely seen them. You’ve likely heard bad things about them. You may have even used one yourself. Shame on you. Like all debatable things though, there’s always going to be another side to it. If you take away anything from reading this, I hope it’s that you question what the other perspective is. Fully understanding something requires you to look at all sides.

 

References

This article was based on information I obtained from the following sources (as well as my own experiences, of course!):


How Do You Structure Your Singletons?

Background

I’ll skip over the discussion about why many people hate singletons (that’s a whole separate can of worms, but worth mentioning) because in the end, it’s not going to get rid of them. A singleton is a design pattern that ensures only one instance of the singleton object can ever be constructed. Often, singletons are made to be “globally accessible” in an application (although, I’m still not confident that this is actually part of the definition of a singleton). There are a handful of different ways to implement singletons… so what are they?

 

The Not-Thread-Safe-Singleton

With singletons, the main goal is to ensure that only one instance can be created. Below is a snippet of singleton code that works, but is not guaranteed in a multi-threaded environment:

    internal class NotThreadSafeSingleton
    {
        private static NotThreadSafeSingleton _instance;

        /// 
        /// Prevents a default instance from being created.
        /// 
        private NotThreadSafeSingleton()
        {
        }

        /// 
        /// Gets the singleton instance (not thread safe).
        /// 
        internal static NotThreadSafeSingleton Instance
        {
            get
            {
                if (_instance == null)
                {
                    _instance = new NotThreadSafeSingleton();
                }

                return _instance;
            }
        }
    }

The key take away point here is that if this singleton class is accessed across multiple threads, it is possible that two threads perform the null check and begin the constructor at the same time. Because this block of code is not protected with a lock a race condition occurs.

 

The Thread-Safe Singleton

If the key problem with the first example was that it is not safe across threads due to lack of locking… Then it should be pretty obvious that we just need to lock!

    internal class ThreadSafeSingleton
    {
        private static readonly object _instanceLock = new object();
        private static ThreadSafeSingleton _instance;

        /// 
        /// Prevents a default instance of the class from being created.
        /// 
        private ThreadSafeSingleton()
        {
        }

        /// 
        /// Gets the singleton instance (thread safe).
        /// 
        internal static ThreadSafeSingleton Instance
        {
            get
            {
                if (_instance == null)
                {
                    lock (_instanceLock)
                    {
                        if (_instance == null)
                        {
                            _instance = new ThreadSafeSingleton();
                        }
                    }
                }

                return _instance;
            }
        }
    }

That looks sort of weird though, doesn’t it? The double-check serves two purposes: a required check for thread safety and a performance optimization! The check inside of the lock actually guarantees that a single instance is created across threads. However, the check outside is an optimization because after the first instance is created, there’s overhead to acquiring the lock just to check if the instance exists.

For what it’s worth, this is actually my preferred method of coding singletons. I’ve been coding them like this for a while in C#, and I find they work nicely. However, I stumbled upon a posting by John Skeet (essentially, a programming god) who’s done a much better job at documenting singleton patterns than I have. In his analysis, he actually notes some drawbacks to this pattern. Firstly, he mentions it doesn’t work in Java, which is a great point. I primarily code in C#, but this is still important to acknowledge. Secondly, he calls upon ECMA CLI memory specifications that state it’s not guaranteed to actually be thread safe without memory barriers. This was slightly out of my comfort zone, so I admit I still have yet to look into this. Finally, I do agree that the pattern is easy to get wrong for more junior developers and as for performance, I’ve never considered it an issue at all. (Hopefully by now you’ve come back from his write-up to finish mine!)

 

Generic Singleton: A Quick Hand At It…

One of the big drawbacks I saw with the thread safe singleton implementation that I favour is the amount of boilerplate code. Every time I want a new singleton class, I have to code basically everything you see in the previous section. It’s redundant and I don’t want to do it. I can hear the singleton-haters saying “so STOP making them!” but as I said, that’s a discussion for a later post.

I decided I’d have a quick attempt at making a generic singleton! Check out my implementation below (and be warned that I flipped some condition checks in order to try and reduce the width of the text to fit nicely here…):

    internal class Singleton<T> where T : class
    {
        private static readonly object _instanceLock = new object();
        private static T _instance;

        /// 
        /// Prevents a default instance of the class from being created.
        /// 
        private Singleton()
        {
        }

        /// 
        /// Gets the singleton instance (thread safe).
        /// 
        internal static T Instance
        {
            get
            {
                if (_instance != null)
                {
                    return _instance;
                }

                lock (_instanceLock)
                {
                    if (_instance != null)
                    {
                        return _instance;
                    }

                    const string SINGLETON_EXCEPTION_MSG =
                            "A single private parameterless constructor is required.";

                    var constructors = typeof(T).GetConstructors(
                        BindingFlags.Public |
                        BindingFlags.CreateInstance |
                        BindingFlags.Instance);

                    if (constructors.Length > 0)
                    {
                        throw new Exception(SINGLETON_EXCEPTION_MSG);
                    }

                    constructors = typeof(T).GetConstructors(
                        BindingFlags.NonPublic |
                        BindingFlags.CreateInstance |
                        BindingFlags.Instance);

                    if (constructors.Length != 1 ||
                        constructors[0].GetParameters().Length > 0 ||
                        !constructors[0].IsPrivate)
                    {
                        throw new Exception(SINGLETON_EXCEPTION_MSG);
                    }

                    _instance = (T)constructors[0].Invoke(null);
                    return _instance;
                }
            }
        }
    }

Doesn’t look to shabby, right? You simply need to create a class with *only* a private constructor and everywhere you want access to the singleton, you say Singleton.Instance. It’s that easy.

Unfortunately, this isn’t as good as I initially hoped. Can you think of an example where this approach breaks down? My hint to you is everything you need to break it is posted right here. Unfortunate really because I thought I was onto something good there! 🙂 In reality, it may not be a bad foundation for your singleton implementations if you don’t like the boiler plate code I listed above.

 

Summary

There are many ways to write singletons, and some are clearly more robust than others. I’ve only shown you a couple here, but John Skeet’s post has a handful more which he prefers. It’s hard to argue with what he’s written (although, I’m not actually fond of his fourth implementation which he likes) because he’s done a great analysis on them. In the end, it’s important to know what a singleton actually is, when you might want to use it, and what the differences actually mean in various implementations.


  • Nick Cosentino

    Nick Cosentino

    I work as a team lead of software engineering at Magnet Forensics (http://www.magnetforensics.com). I'm into powerlifting, bodybuilding, and blogging about leadership/development topics over at http://www.devleader.ca.

    Verified Services

    View Full Profile →

  • Copyright © 1996-2010 Dev Leader. All rights reserved.
    Jarrah theme by Templates Next | Powered by WordPress