Thursday, March 13, 2008

NValidate: Misunderstood from the Outset

Occasionally, I will post questions about the design or feature set of NValidate on Google Newsgroups. More recently, I posted a question about it to LinkedIn. Almost immediately, I got this response:

I'd suggest looking at the Validation Application Block portion of the Enterprise Library from the Microsoft Patterns and Practices group.

Now, I'm not belittling the response, because it's perfectly valid, and the Validation Application Block attempts to solve essentially the same problem. But when I talk about NValidate, which I find myself doing a lot as I interview for jobs (it's listed on my résumé), people often ask me questions like these:

  1. How is that any different from the Validator controls in ASP.NET?
  2. Why don't you just use the Validation Application Block?
  3. Why didn't you go with attributes instead?
  4. Why didn't you use interfaces in the design?
  5. Why not just use assertions instead of throwing exceptions?

These days, I find myself answering these questions with alarming frequency. It occurs to me that I should probably get around to answering them, so I'm going to address them here and now.

It helps, before starting, to understand the problem that NValidate is trying to solve: Most programmers don't write consistent, correct parameter validation code because it's tedious, boring, and a pain in the neck. We'd rather be working on something else (like the business logic). Writing parameter validation code is just too difficult. NValidate tries to solve that problem by making it as easy as possible, with a minimal amount of overhead.

Q. How is NValidate any different from the Validator controls in ASP.NET?

A. The Validator controls in ASP.NET can only be used on pages. But what if I'm designing a class library? Isn't it vitally important that I test the parameters on my public interface to ensure that the caller passes me valid arguments? If I don't, I'm going to fail spectacularly, and not in a pretty way. You can't use the Validator controls (RangeValidator, CompareValidator, and so on) in a class library, even one that's intended to be invoked from your Web application.

Q. Why don't you just use the Validation Application Block?

A. This one's pretty easy to answer. NValidate is designed to accommodate lazy programmers (like me).

Here's the theory that essentially drives the design of NValidate: Developers don't write parameter validation code with any sort of consistency because it's a pain in the neck to write it, and because we're in a big hurry to get to the business logic (the meat and potatoes of the software). Let's face it: if the first chunk of the code has to be two to twenty lines of you checking parameters and throwing exceptions, and doing it all over the place, you'd get tired of doing it, too. Especially if that code is extremely repetitive.

if (null == foo) throw new ArgumentNullException("foo");
if (string.Empty == foo) throw new ArgumentException("foo cannot be empty.");
if (foo.Length != 5) throw new ArgumentException("foo must be 5 characters.");

We hate writing this stuff. So we skip it, thinking we'll come back and write it later. But it never gets done, because we get all wrapped up in the business logic, and we simply forget. Then we're fixing bugs, going to meetings, putting out fires, reading blogs, and it gets overlooked. And the root cause is that it's tedious and boring.

I'm not making this up, folks. I've talked to lots of other developers, and they've all admitted (however reluctantly) that it's pretty much the truth. We're all guilty of it. Bugs creep in because we fail to erect that impenetrable wall that prevents invalid parameter values from slipping through. Then, with egg on our faces, we have to go back after the fact and add the validation code, at increased cost.

So, if you want to make sure that developers will write the parameter validation code, or are at least more likely to do it, you have to make it as easy as possible to do so. That means writing as little code as possible.

Now, if we look at the code sample provided by Microsoft on their page for the Validation Application Block, we see this:

using Microsoft.Practices.EnterpriseLibrary.Validation;
using Microsoft.Practices.EnterpriseLibrary.Validation.Validators;
public class Customer
{
    [StringLengthValidator(0, 20)]
    public string CustomerName;
    public Customer(string customerName)
    {
        this.CustomerName = customerName;
    }
}

public class MyExample
{
    public static void Main()
    {
        Customer myCustomer = new Customer("A name that is too long");
        ValidationResults r = Validation.Validate<Customer>(myCustomer);
        if (!r.IsValid)
        {
            throw new InvalidOperationException("Validation error found.");
        }
    }
}

A couple of things worth noting:

  1. You have to import two namespaces.
  2. You have to apply a separate attribute for each test.
  3. In your code that invokes the test, you need to do the following:
    1. Declare a ValidationResults variable.
    2. Call Validation.Validate and assign the result to your ValidationResults variable.
    3. Potentially do a cast.
    4. Check the IsValid result on your ValidationResults variable.
    5. If IsValid returned false, take the appropriate action.

That's a lot of work. If you're trying to get lazy programmers to rigorously validate parameters, that's not going to encourage them a whole lot.

On the other hand, this is the same sample, done in NValidate:

using NValidate.Framework;
public class Customer
{
    public string CustomerName;
    public Customer(string customerName)
    {
        Demand.That(customerName, "customerName").HasLength(0, 20);
        this.CustomerName = customerName;
    }
}

public class MyExample
{
    public static void Main()
    {
        try
        {
            Customer myCustomer = new Customer("A name that is too long");
        }
        catch (ArgumentException)
        {
            throw new InvalidOperationException("Validation error found.");
        }
    }
}

A couple of things worth noting:

  1. You only have to import one namespace.
  2. In the constructor, you simply Demand.That your parameter is valid.
  3. In your code that invokes the test, you need to do the following:
    1. Wrap the code in a try...catch block.
    2. Catch the exception and handle it, if appropriate.

See the difference? You don't have to write a lot of code to validate the parameter, and your clients don't have to write a lot of code to use your class, either.

Q. Why didn't you go with attributes instead?

A. I considered attributes in the original design of NValidate. But I ruled them out for a number of reasons:

  1. Using them would have meant introducing a run-time dependency on reflection. While reflection isn't horrendously slow, it is slower than direct method invocation, and I wanted NValidate to be as fast as possible.
  2. I wanted the learning curve for adoption to be as small as possible. I modeled the public interface for NValidate after a product I thought was pretty well known: NUnit. You'll note that Demand.That(param, paramName).IsNotNull() is remarkably similar to NUnit's Assert.IsNotNull(someTestCondition) syntax.
  3. In NValidate, readability and performance are king. Consequently, it uses a fluent interface that allows you to chain the tests together, like so:

    Demand.That(foo, "foo").IsNotNull().HasLength(5).Matches("\\d{5}");

    This is a performance optimization that results in fewer objects created at runtime. It also allows you to do the tests in a smaller vertical space.
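
To make the chaining idea concrete, here is a minimal sketch of one way it can work. It is not NValidate's actual source: the StringValidator class and this stand-in Demand class are hypothetical, and only the call shape and test names are taken from the line above. Each test returns the same validator instance, so a whole chain costs one object.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical validator: one instance is created per Demand.That call
// and reused by every test in the chain.
public sealed class StringValidator
{
    private readonly string _value;
    private readonly string _name;

    public StringValidator(string value, string name)
    {
        _value = value;
        _name = name;
    }

    public StringValidator IsNotNull()
    {
        if (_value == null) throw new ArgumentNullException(_name);
        return this; // returning the same instance is what enables chaining
    }

    public StringValidator HasLength(int length)
    {
        // A null value also fails the length test.
        if (_value == null || _value.Length != length)
            throw new ArgumentException(
                _name + " must be " + length + " characters long.", _name);
        return this;
    }

    public StringValidator Matches(string pattern)
    {
        // A null value also fails the pattern test.
        if (_value == null || !Regex.IsMatch(_value, pattern))
            throw new ArgumentException(
                _name + " does not match " + pattern + ".", _name);
        return this;
    }
}

// Stand-in for the entry point; the real factory may differ.
public static class Demand
{
    public static StringValidator That(string value, string name)
    {
        return new StringValidator(value, name);
    }
}
```

With these definitions, a chain stops at the first test that fails, and the exception names the offending parameter.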

My concerns about attributes and reflection may not be readily apparent until you consider the following: it's conceivable (in theory) that zealous developers could begin validating parameters in every frame of the stack. If the call stack is sufficiently deep, the costs of invoking reflection to parse the metadata begin to add up. That may not seem significant yet, but consider the scenario where any one of those methods is recursive; perhaps it walks a binary tree, a DOM object, an XML document, or a directory containing lots of files and folders. When that happens, the costs of reflection can become prohibitive.

In my book, that's simply not acceptable. And since, as a framework developer, I cannot predict or constrain where a user might invoke these methods, I must endeavor to make it as fast as possible. In other words, take the parameter information, create the appropriately typed validator, execute the test, and get the hell out as quickly as possible. Avoid any additional overhead at all costs.
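
As a rough illustration of "create the appropriately typed validator, execute the test, and get out", the sketch below uses plain method overloads, so the compiler chooses the validator type and nothing is looked up through reflection at run time. Every class here, including this stand-in Demand and the Recursion helper, is hypothetical and much simpler than whatever NValidate really does.

```csharp
using System;

// Hypothetical typed validators; NValidate's real classes may look different.
public sealed class StringValidator
{
    private readonly string _value;
    private readonly string _name;

    public StringValidator(string value, string name)
    {
        _value = value;
        _name = name;
    }

    public StringValidator IsNotNull()
    {
        if (_value == null) throw new ArgumentNullException(_name);
        return this;
    }
}

public sealed class Int32Validator
{
    private readonly int _value;
    private readonly string _name;

    public Int32Validator(int value, string name)
    {
        _value = value;
        _name = name;
    }

    public Int32Validator IsGreaterThan(int limit)
    {
        if (_value <= limit)
            throw new ArgumentOutOfRangeException(
                _name, _value, _name + " must be greater than " + limit + ".");
        return this;
    }
}

public static class Demand
{
    // The compiler binds each call to an overload using the argument's
    // static type, so no metadata is inspected at run time.
    public static StringValidator That(string value, string name)
    {
        return new StringValidator(value, name);
    }

    public static Int32Validator That(int value, string name)
    {
        return new Int32Validator(value, name);
    }
}

public static class Recursion
{
    // Validates its argument in every frame, like the recursive scenario above.
    public static int SumTo(int n)
    {
        Demand.That(n, "n").IsGreaterThan(-1);
        return n == 0 ? 0 : n + SumTo(n - 1);
    }
}
```

Each recursive call still allocates one small validator object, which is the overhead this design accepts in exchange for avoiding metadata lookups.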

Q. Why didn't you use interfaces in the design?

A. I go back and forth over this one all the time, and I keep coming back to the same answer: Interfaces would tie my hands.

Let's assume, for a moment, that we published NValidate using nothing but interfaces. Now suppose that, in a subsequent release, we decide we want to add new tests. We have a problem: we can't extend the interfaces without breaking the contract with clients who are built against NValidate. Sure, they'll likely have to recompile anyway; but if I add new methods to interfaces, they might have to recompile lots of assemblies. That's something I'd rather not force them to do.

On the other hand, abstract base classes allow me to extend classes and add new tests and new strongly typed validators fairly easily. Further, this approach eliminates casting (because that's handled by the factory). If, however, the system is using interfaces, some methods will return references to an interface, some will return references to strongly typed validators, and some casting will have to be done at the point of call. I want to eliminate manual casting whenever I can, to keep that call to Demand.That as clean as possible: the cleaner it is, the more likely someone is to use it, because it's easy to do.
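
A small sketch of the versioning argument, using made-up class names rather than NValidate's real ones: a test added to an abstract base class ships with its body, so a subclass written before the test existed picks it up unchanged. An interface member added the same way, with no body, would have to be implemented by every existing implementer.

```csharp
using System;

// Hypothetical base class. "Version 1" shipped with IsNotNull() only.
public abstract class StringValidatorBase
{
    protected readonly string Value;
    protected readonly string Name;

    protected StringValidatorBase(string value, string name)
    {
        Value = value;
        Name = name;
    }

    public StringValidatorBase IsNotNull()
    {
        if (Value == null) throw new ArgumentNullException(Name);
        return this;
    }

    // Added in "version 2". It arrives with a body, so subclasses written
    // against version 1 inherit it without any change.
    public StringValidatorBase IsNotEmpty()
    {
        if (Value != null && Value.Length == 0)
            throw new ArgumentException(Name + " cannot be empty.", Name);
        return this;
    }
}

// A subclass written when only IsNotNull() existed. It is untouched here.
public sealed class TrimmedStringValidator : StringValidatorBase
{
    public TrimmedStringValidator(string value, string name)
        : base(value == null ? null : value.Trim(), name)
    {
    }
}
```

Calling new TrimmedStringValidator("   ", "foo").IsNotNull().IsNotEmpty() throws an ArgumentException, even though TrimmedStringValidator never mentions IsNotEmpty.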

Q. Why not just use assertions instead of throwing exceptions?

A. This should be fairly obvious: assertions don't survive into the release version of your software. Additionally, they don't work as you'd expect them to in a Web application, and rightly so, since they'd kill the ASP.NET worker process and abort every session connected to it. (For a truly educational experience, set up a test web server and issue a Visual Basic Stop statement from a DLL in your Web app. You'll kill the worker process, and it will be reset on the next request. Nifty.)

Wisdom teaches us that the best laid plans of mice and men frequently fail. Your most thorough testing will miss some points of your code. The chances of achieving 100% code coverage are pretty remote; if you do it with a high degree of frequency, I'm duly impressed (and I'd like to submit my résumé). But the rest of us know that some code never gets executed during testing, and some code gets executed, but not under the precise conditions that might reveal a subtle defect. That's why you want to leave those checks in the code. Yes, it's additional overhead. But wouldn't you rather know?
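
The difference can be seen side by side in a short sketch. The Account class and its methods are invented for illustration. Debug.Assert is marked with [Conditional("DEBUG")], so calls to it are omitted when the DEBUG symbol is not defined, as in a typical release build, while the explicit check is always compiled in.

```csharp
using System;
using System.Diagnostics;

// Hypothetical class used only to contrast the two styles of check.
public static class Account
{
    // The assertion vanishes from builds that do not define DEBUG,
    // so a negative amount slips through there.
    public static decimal WithdrawAsserted(decimal balance, decimal amount)
    {
        Debug.Assert(amount >= 0, "amount must not be negative");
        return balance - amount;
    }

    // The explicit check stays in every build configuration.
    public static decimal WithdrawChecked(decimal balance, decimal amount)
    {
        if (amount < 0)
            throw new ArgumentOutOfRangeException(
                "amount", amount, "amount must not be negative.");
        return balance - amount;
    }
}
```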

In Summary

Sure, these are tradeoffs in the design. But let's keep in mind who I'm targeting here: lazy programmers who are typically disinclined to write lots of code to validate their parameters. The idea is that we want to make it so easy that they're more likely to do it. In this case, less code hopefully leads to more validation, which (I hope) leads to fewer defects and higher quality software.

Wednesday, January 30, 2008

Still Alive!

So, it's probably not surprising that everyone thinks that NValidate is dead. There's been no activity on the forums since November or so, and the blog hasn't seen a post since October 24th. Since the alpha release, things have been pretty darned quiet.

A lot of that has to do with the nature of my job, and the shortage of time it imposes on me. I am the sole developer at a company that churns out massive amounts of code to support a government agency. It's a grueling task that requires me to wear 14 hats, while only getting paid for one of them. The job pretty much consumes my life, and by the time I get home, staring at a computer is usually the last thing that I want to do. So NValidate hasn't exactly been a priority of late. But things are changing, and I've been refocusing my priorities so that I can once again put some balance back into the things that are important to me. So NValidate, a project near and dear to my heart, has once again reached its proper place of prominence.

What's Happened Recently

Some time ago, shortly after the creation of the website, in fact, I had submitted NValidate to Google for inclusion in their search engine. It took what seemed like forever for that to happen. I took that, in my naiveté, to mean that they'd denied the request. What I should have considered was the massive number of requests that they probably get for sites to include in the search engine. But we've finally been included. So now, perhaps, interest in NValidate will increase, and we'll get more folks actively participating in it.

Performance Improvements

Recently, NValidate underwent a substantial rewrite of the validators for the value types. In the Alpha release, these had all been derived from a common base class (ValueTypeValidator), which provided a set of tests that were common to all value types. But the performance penalty for using inheritance for the value type validators turned out to be fairly severe.

Is and IsNot

As was noted in the Known Issues forums, these weren't fully implemented across all the classes derived from ObjectValidatorBase. I've finished this off, so that the tests are fully implemented.

New Test: IsNotNegative()

Somehow, this simple test got overlooked. I've added it to all the signed numeric types. Essentially, it throws an ArgumentOutOfRangeException if the parameter's value is less than zero. It should be cleaner than a comparison test such as IsGreaterThan(-1), and, unlike IsGreaterThan(0), it accepts zero.
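
For readers who want to picture the test, here is a minimal sketch for the int case. The Int32Validator class shown is hypothetical and is not the framework's implementation; only the test name and the exception type come from the description above. Note that zero passes.

```csharp
using System;

// Hypothetical validator for int parameters.
public sealed class Int32Validator
{
    private readonly int _value;
    private readonly string _name;

    public Int32Validator(int value, string name)
    {
        _value = value;
        _name = name;
    }

    // Throws only for values below zero; zero itself is accepted.
    public Int32Validator IsNotNegative()
    {
        if (_value < 0)
            throw new ArgumentOutOfRangeException(
                _name, _value, _name + " must not be negative.");
        return this;
    }
}
```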

New Tests for ArrayValidator

One of my brainier ideas was to review the source code for some of the open source projects out there and see which parameter validations they perform that I might have overlooked. Arrays, apparently, are an area where I've not been doing a lot of work. (I tend to use collections more often than not, but that's likely because I come from a VB background, and only recently started working with arrays for their performance benefits.)

In reviewing some of the source code, I noticed that a few tests would be really useful. So here are a few of them:

  • HasRanks(int): Throws an ArgumentException if the array doesn't have the specified number of dimensions.
  • IsMultidimensional: Throws an ArgumentException if the array doesn't have more than one dimension.
  • IsNotMultidimensional: Throws an ArgumentException if the array has more than one dimension.
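
The three tests above can be sketched with the Array.Rank property, which reports the number of dimensions. This ArrayValidator is a hypothetical stand-in written for illustration, and rejecting null in the constructor is my assumption rather than documented behavior.

```csharp
using System;

// Hypothetical validator for array parameters.
public sealed class ArrayValidator
{
    private readonly Array _value;
    private readonly string _name;

    public ArrayValidator(Array value, string name)
    {
        // Assumption: a null array is rejected up front.
        if (value == null) throw new ArgumentNullException(name);
        _value = value;
        _name = name;
    }

    public ArrayValidator HasRanks(int ranks)
    {
        if (_value.Rank != ranks)
            throw new ArgumentException(
                _name + " must have " + ranks + " dimension(s).", _name);
        return this;
    }

    public ArrayValidator IsMultidimensional()
    {
        if (_value.Rank <= 1)
            throw new ArgumentException(
                _name + " must have more than one dimension.", _name);
        return this;
    }

    public ArrayValidator IsNotMultidimensional()
    {
        if (_value.Rank > 1)
            throw new ArgumentException(
                _name + " must have exactly one dimension.", _name);
        return this;
    }
}
```

A jagged array such as int[][] has a Rank of 1, so it counts as single-dimensional under these tests; only rectangular arrays such as int[,] are multidimensional.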

Where We're Going

I'm hoping to create an entry on CodeProject or something similar, and have it point to NValidate.Org. I need to drive up community participation. Visibility is everything.

I'd also like to get more ideas for ways to improve the NValidate framework. More eyes will improve it.

The one part of the system that bugs me most is its extensibility. I'm really interested to hear what others would do to make it more extensible--how would they improve on the ability to plug in their support for unsupported tests and types without seriously degrading performance? That's a vexing problem.

Until next time!

Wednesday, October 24, 2007

Dogged Support for .NET 1.1

When I'm writing NValidate, one of the things that I am adamant about is that it must support .NET 1.1.

There are a lot of developers out there who are working for companies that have software that targets .NET 1.1. Those companies have no plans to move to .NET 2.0 any time in the near future. And so, in the short term, those 1.1 applications must be maintained. Or, new applications must be written that target 1.1, because the company chooses not to spend the money on a new development suite when they already have a working development suite in house.

I'm not casting aspersions here; I'm simply telling it like it is. I'm one of those developers who is caught in a similar scenario. So I feel the pain of those developers who want to move to a more current version of the Framework, but can't.

There is very likely a vast body of .NET 1.1 code out there undergoing constant maintenance and refactoring. To ignore those developers and their plight would be (in my eyes) cruel and inhumane. For that reason, NValidate will be compiled to target both .NET 1.1 and .NET 2.0. I'm maintaining a Visual Studio 2003 project and a Visual Studio 2005 project. Both share the same files in Visual SourceSafe. (Shoot me later about my choice of source code control software.) The point is that it's the same code, targeting two different versions of the Framework.

The .NET 2.0 version is free to evolve and support things like generics. (Thank the Powers That Be for conditional compilation.) But any code that is added that would impact both versions must behave identically and deterministically.
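
As an illustration of how one source file can serve both projects, the sketch below guards a generic overload with a compilation symbol. The symbol name NET_2_0 is an assumption (imagine it defined only in the Visual Studio 2005 project's build settings), and the CollectionDemands class is invented for this example. Everything outside the #if blocks sticks to syntax that a .NET 1.1 compiler accepts, which is why the class is sealed with a private constructor rather than declared static.

```csharp
using System;
using System.Collections;
#if NET_2_0
using System.Collections.Generic;
#endif

// Hypothetical helper shared by both projects.
public sealed class CollectionDemands
{
    // Private constructor: the class only exposes static methods.
    private CollectionDemands() { }

    // Compiled into both builds.
    public static void IsNotEmpty(ICollection value, string name)
    {
        if (value == null) throw new ArgumentNullException(name);
        if (value.Count == 0)
            throw new ArgumentException(name + " must not be empty.", name);
    }

#if NET_2_0
    // Compiled only when the (assumed) NET_2_0 symbol is defined.
    // It has a different name so that calls with types implementing both
    // ICollection and ICollection<T> are never ambiguous.
    public static void IsNotEmptyOf<T>(ICollection<T> value, string name)
    {
        if (value == null) throw new ArgumentNullException(name);
        if (value.Count == 0)
            throw new ArgumentException(name + " must not be empty.", name);
    }
#endif
}
```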

An interesting side effect of this is that I am not free to upgrade my computer to Windows Vista. Visual Studio 2003 doesn't run under Vista; and I not only have to be able to build the software for .NET 1.1, I have to be able to test and prove that it works properly in the IDE.

Sure, I could install a virtual machine, install Windows XP into that, install Visual Studio onto that, and save the image. Then, when I wanted to test, I could restore the image, install NValidate, test, and uninstall NValidate. But why? The simple expedient of not upgrading to Vista eliminates the problem handily.

So I will continue to do what it takes to make sure that those of us who must maintain .NET 1.1 code have full access to all the features of NValidate that it makes sense for the 1.1 version to provide.