
Monday, March 18, 2013

Which ASP.NET MVC validation strategy should we use?

For at least as long as programming has existed, people have tried to be tricky.

I'm sure we have all tried to architect something tricky, only to have to revert our project and sheepishly go with a simpler solution.

Whenever I look at something tricky that isn't an optimization, it's almost always just a tricky way to waste CPU cycles.

The lure of trickiness we'll be looking at is validation in ASP.NET MVC.  Let's look at three methods of validation.

The bad way (using model objects)


Unless you want to get hacked the same way GitHub did last year, don't do the following.  Binding model objects straight to a view leaves you open to over-posting (model injection); it's usually only new MVC users who try it.  The initial lure is that you can do this without creating an extra ViewModel.
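Here's a minimal sketch of what that looks like (the User class, ProfileController and IUserRepository below are illustrative, not taken from any real project):

```csharp
using System.Web.Mvc;

// A domain object that Entity Framework persists -- not something to bind straight to a form.
public class User
{
    public int Id { get; set; }
    public string Username { get; set; }
    public string Email { get; set; }
    public string Password { get; set; }
}

public interface IUserRepository
{
    void Save(User user);
}

public class ProfileController : Controller
{
    private readonly IUserRepository _repository;

    public ProfileController(IUserRepository repository)
    {
        _repository = repository;
    }

    [HttpPost]
    public ActionResult Edit(User user)     // the model binder will happily fill Id and Username too
    {
        if (ModelState.IsValid)
        {
            _repository.Save(user);         // saves whatever the client chose to post
            return RedirectToAction("Index");
        }
        return View(user);
    }
}
```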


It's not an extensible way to write code anyway, so luckily you have an excuse not to be lazy.  The second you need to use the same domain object on multiple pages for even slightly different purposes, this pattern falls apart and subsequent pages will have trouble accommodating new functionality.

Add a field later (an IsAdmin flag, say) and now we have a security problem, because model binding is blind and will greedily bind anything it can.
Actually, the example above is insecure out of the box: if you only wanted to let the user update their email and password, you'd have to specifically guard the Username and Id (and any new fields, as pointed out above).

It also shows how unclear the above code is--you may have assumed the Id and Username were meant to come from submitted fields, even though this is a profile-editing controller.  It's just that much less readable.


The tricky way (using centralized validation close to your model)


Okay, fine, we'll use ViewModels, but we're going to be tricky.  We're going to keep all of our validation inside our domain objects.  When someone inputs some data, we'll map the ViewModel to our domain objects, check the validation rules, and then bubble the results up.  It's genius.
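A rough sketch of the idea (the domain object, ViewModel and mapping code here are hypothetical; they're only meant to show the shape of the pattern):

```csharp
using System.Collections.Generic;
using System.Web.Mvc;

// The domain object owns its validation rules.
public class User
{
    public string Email { get; set; }
    public string Password { get; set; }

    public IEnumerable<string> Validate()
    {
        if (string.IsNullOrEmpty(Email) || !Email.Contains("@"))
            yield return "A valid email address is required.";
        if (string.IsNullOrEmpty(Password) || Password.Length < 8)
            yield return "The password must be at least 8 characters.";
    }
}

public class EditProfileViewModel
{
    public string Email { get; set; }
    public string Password { get; set; }
}

public class ProfileController : Controller
{
    [HttpPost]
    public ActionResult Edit(EditProfileViewModel model)
    {
        // Map the ViewModel onto the domain object...
        var user = new User { Email = model.Email, Password = model.Password };

        // ...run the domain rules, and bubble the results back up into ModelState.
        foreach (var error in user.Validate())
            ModelState.AddModelError(string.Empty, error);

        if (!ModelState.IsValid)
            return View(model);

        // persist the user here
        return RedirectToAction("Index");
    }
}
```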

Actually it's bad.  For several reasons.  Exceptions to rules will be painful.

Let's not say things we don't mean and have to take them back


It also doesn't take advantage of MVC's model state, which I find to be quite nice and which lets you check errors in two separate passes:


  • Once for easy validations like max length and whether a field is required, 
  • Again for dynamic validations that involve costly database operations.


Bundling those two operations isn't good, but without a simple way to perform client-side checks, that's what you'll end up doing.  (If you do have jQuery validate on the client, I would just bundle them on the server for the sake of saving a few lines of code in each controller.)  Here's roughly what keeping them separate looks like:
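This is a sketch of the two passes kept apart in a controller (IAccountService is hypothetical, and RegisterViewModel stands in for a ViewModel whose simple rules live in data annotations):

```csharp
using System.Web.Mvc;

public interface IAccountService
{
    bool IsEmailTaken(string email);
    void Register(string email, string password);
}

public class RegisterViewModel
{
    // In practice this class carries the data annotation attributes shown in the next section.
    public string Email { get; set; }
    public string Password { get; set; }
}

public class AccountController : Controller
{
    private readonly IAccountService _accounts;

    public AccountController(IAccountService accounts)
    {
        _accounts = accounts;
    }

    [HttpPost]
    public ActionResult Register(RegisterViewModel model)
    {
        // Pass 1: the cheap checks (required fields, max lengths) are already in ModelState.
        if (!ModelState.IsValid)
            return View(model);

        // Pass 2: the dynamic check that costs a database round trip.
        if (_accounts.IsEmailTaken(model.Email))
        {
            ModelState.AddModelError("Email", "That email address is already registered.");
            return View(model);
        }

        _accounts.Register(model.Email, model.Password);
        return RedirectToAction("Index", "Home");
    }
}
```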

I don't care what horrid path Visual Studio Magazine (scroll to 'Integrating annotations') is trying to send you down.  Don't do the tricky way.  It's cute little examples like these that get people being tricky.  But then you feed tricky after midnight and bam...code gremlins.  They even seem to recommend 'the bad way'.

This pattern could work in a fully back-end scenario, but the next pattern is so durable, efficient and fast that nothing else is worth using for ASP.NET MVC development.

The smart way (also the easiest way -- using ViewModels and validation attributes)



A good example: simple, and essentially what gets created for you in the default MVC4 project template.
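Something along these lines (an illustrative class, adapted in spirit from the template's account models rather than copied from them):

```csharp
using System.ComponentModel.DataAnnotations;

public class RegisterViewModel
{
    [Required]
    [StringLength(50)]
    [Display(Name = "User name")]
    public string UserName { get; set; }

    [Required]
    [EmailAddress]
    public string Email { get; set; }

    [Required]
    [StringLength(100, MinimumLength = 6)]
    [DataType(DataType.Password)]
    public string Password { get; set; }

    [DataType(DataType.Password)]
    [Compare("Password", ErrorMessage = "The passwords do not match.")]
    public string ConfirmPassword { get; set; }
}
```

Bind a form to that, check ModelState.IsValid in the controller, and both the server-side checks and the client-side jQuery validate hooks come along for free.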


But author, creating a ViewModel AND validation attributes for each form is more typing.  How is it the easiest way?  Because ordering your thoughts is hard.  Not being able to trigger your validation on the client is hard.  Having to mess with your DbContext just to validate user input (noise++ in your codebase) is hard.  And having to make an exception to what you thought were your domain rules becomes a hack ("I'll just remove this one validation rule...").

Let me be clear about that last point.  Adding something and then removing it later down the pipeline before you do any real work is a HUGE anti-pattern.  I don't know this anti-pattern by a name, but this is one of the biggest mistakes I see in code.  It's unclear, it's a hack, it's inefficient, it shouldn't be done.


Here is the article to read that explains the best practices of the ViewModel pattern.  (Although the author doesn't mention that your HTML will be rendered with jQuery validate baked into it, which is a big plus.)


It's worth noting that this works great with ajax-heavy sites.  I almost exclusively use ajax to submit forms, and working with data annotation attributes is still preferred.  Data annotation validators will cover any non-dynamic validations you need, and the jQuery validate attributes will be rendered in your HTML, which you can easily call out to yourself (a huge benefit over the previous approach: an easy way to make super-responsive pages with automatic client-side validation).

The ViewModel pattern solves so many issues.  It's not completely D.R.Y., though.

Or is it?

If you have two pages that allow modification of a profile (let's say, registration and profile updating), you will have to copy and paste your name-length validator attribute to both ViewModels, as in the snippet below.  I would say that attributes are data and not code (we've been blurring the line lately... look at expressions), and that the code itself is still D.R.Y.
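For example (illustrative classes):

```csharp
using System.ComponentModel.DataAnnotations;

// The same rule repeated on two ViewModels that edit the same underlying data.
public class RegisterViewModel
{
    [Required, StringLength(50)]
    public string Name { get; set; }
    // ...plus email, password, etc.
}

public class EditProfileViewModel
{
    [Required, StringLength(50)]
    public string Name { get; set; }
    // ...plus whatever the edit page needs.
}
```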

Conclusion

Most MVC developers already know, and practice, everything I've said above.  So why would I spend my time writing about it?   Because why not examine the other avenues and possible architectures even if they end up being poor choices?  If you ever felt like maybe you were missing something or you were wasting time when writing a bunch of ViewModels and repeating attributes on them, well, you should feel a bit more justified after reading this.  It's the standard--keep doing it.

Also, hasn't everyone tried conquering validation with a tricky pattern at some point?  I think the answer is probably yes.  But at the end of the day, being as specific as you can and failing as soon as you can (client passes -> server-side checks pass -> dynamic server-side checks pass -> okay, go) is always going to be king, and it does not lend itself to tricky patterns.

This post may leave you thinking about unit testing as much as possible, because there are some maintainability considerations with validation attributes on ViewModels, even if this is the best way to write your MVC application.  What do you think?

Monday, March 4, 2013

LINQKit is good and more people should use it

If I had to pick one library that is criminally underused in the C#/.NET world, it would be LINQKit: a library for making it easier to manipulate and work with expressions (most often used in conjunction with LINQ).

I think part of the reason it's underused is that it's basically 100% complete, a few years old, and you can easily live without it; though I'll try to convince you that you don't want to.  LINQKit is boring because it does one thing, and it's been doing that one thing for years already.  But from what I've seen, it remains horribly underused.

LINQKit has become almost an automatic addition to the .NET projects I start.  Most of the time I start a medium-sized or bigger C# project, I 'install-package LINQKit'.

Which brings me to my next point.  I'd make the bold statement that if you have an app that's >50,000 lines of code and it's heavy on data access/Entity Framework/lambda expressions, chances are your codebase could be cleaned up and DRYed out if you used LINQKit.  I decided to write this article because few people seem to use it, and I have a friend working on a commercial project that would be cleaned up by LINQKit because it's very expression-heavy (unfortunately, one of the project requirements is no 3rd-party libraries).  It has me wondering how many coders are unaware of this awesome library.
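To give a flavor of what that cleanup looks like, here's a minimal sketch using LINQKit's PredicateBuilder and AsExpandable (the Order entity and the search criteria are made up):

```csharp
using System.Collections.Generic;
using System.Linq;
using LinqKit;

public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public string Description { get; set; }
}

public static class OrderSearch
{
    // Compose a filter from optional criteria, then hand the single composed
    // expression to Entity Framework (or any IQueryable provider) via AsExpandable().
    public static List<Order> Find(IQueryable<Order> orders, int? customerId, string searchTerm)
    {
        var predicate = PredicateBuilder.True<Order>();

        if (customerId.HasValue)
            predicate = predicate.And(o => o.CustomerId == customerId.Value);

        if (!string.IsNullOrEmpty(searchTerm))
            predicate = predicate.And(o => o.Description.Contains(searchTerm));

        return orders.AsExpandable().Where(predicate).ToList();
    }
}
```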


Only 13,000 NuGet downloads.  1.2 Million for Entity Framework!

I kept the sketch above deliberately brief and decided against a longer code example for this post.  There are plenty of short examples out there that will illustrate the usefulness to you.

Here are a few links with convincing examples of LinqKit usage

Friday, March 1, 2013

A brief Entity Framework Codebase Overview

Open source enterprise projects are a funny thing.  Leaked enterprise software seems to get read more often than enterprise open source software; once you willingly release your source, people seem to stop being infatuated with it unless they plan to modify the code.

Few people seem to actually read a portion of the code just for the sake of reading it, especially on massive projects.

I couldn't find a single article that gave an overview of what's going on inside the Entity Framework project so I'm hoping to give some objective insight on what I found and point out some of the more interesting things along the way.

Solution Overview

(As of commit a88645f8581c)
Visual Studio 2012
Language : C# with some tests in VB.NET
Lines of Code : 188,547 (145,126 not counting tests)
Projects : 10


Test Projects : 4 (2 Functional, 1 Unit, 1 VB.NET)
Test Framework : xUnit (no test-initialize method is supported, and test class constructors aren't used for setup either)

Solution Build Time : 29.26 seconds

Test count, run time


Unit : 4713 tests ran in 233.59 seconds
Functional : 3541 tests ran in 822.97 seconds
Functional.Transitional : 1865 tests ran in 344.25 seconds
VB.NET : 47 tests ran in 6.28 seconds

(All Done on quad core i7 3.4 GHz with TestDriven.Net)

FxCop ruleset : Microsoft Managed Recommended Rules, plus a custom ruleset for the EntityFramework assembly.
FxCop rule suppressions : 2,345

Some Code Overview


How are the nuances of every different SQL Server version handled?  This was one of my biggest curiosities when I opened this project.  Take a second to postulate.

The answer?  Lots of inline version checks and great test coverage.  Here's the sort of code that has to wrangle multiple versions of SQL Server.
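I won't reproduce the file here, but the shape of it is roughly this (a paraphrased sketch, not the actual EF source):

```csharp
// Paraphrased sketch only -- EF keeps an internal SqlVersion enum along these lines.
internal enum SqlVersion
{
    Sql8 = 80,   // SQL Server 2000
    Sql9 = 90,   // SQL Server 2005 ("Yukon")
    Sql10 = 100, // SQL Server 2008 ("Katmai")
    Sql11 = 110  // SQL Server 2012
}

internal static class SqlTypeHelper
{
    internal static string GetDateTimeColumnType(SqlVersion sqlVersion)
    {
        // Katmai and later support datetime2; older versions fall back to datetime.
        return sqlVersion >= SqlVersion.Sql10 ? "datetime2" : "datetime";
    }
}
```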

Fun fact : SQL Server 2008 is still referred to by its codename "Katmai" in comments and method names all over the codebase.  The same goes for SQL Server 2005, "Yukon".

There are accompanying tests that cover all SQL Server versions.  (Interesting that they chose underscores to separate words in test names over camel casing.)
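The style looks roughly like this (an invented test that only illustrates the naming convention, reusing the sketch above; it is not an actual EF test):

```csharp
using Xunit;

public class SqlTypeHelperTests
{
    [Fact]
    public void GetDateTimeColumnType_returns_datetime2_for_Katmai_and_later()
    {
        Assert.Equal("datetime2", SqlTypeHelper.GetDateTimeColumnType(SqlVersion.Sql10));
    }

    [Fact]
    public void GetDateTimeColumnType_falls_back_to_datetime_before_Katmai()
    {
        Assert.Equal("datetime", SqlTypeHelper.GetDateTimeColumnType(SqlVersion.Sql9));
    }
}
```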


Part of the coding standard is to use 'var' in local variable declarations wherever possible.  Not uncommon, and a part of my standards as well.


I noticed some copied and pasted classes even when both copied classes are in projects that have a common dependency on the EntityFramework assembly.  This is likely no problem for a team with rigorous reviewing standards but it could be a bit nicer. Don't take this  as "The entire codebase is copied and pasted"; it's really just something that piqued my interest.



The new Analyze Solution for Code Clones functionality in Visual Studio 2012 rocks, by the way.  It's still not as good or as fast as Atomiq (pictured above) but it's worth trying out.

Conclusions 


The code is extremely well written.  The test coverage is fantastic, and the entire codebase looks like it was written by a single super-competent programmer, given how consistent the coding standard is.  It's interesting that they went with xUnit and didn't put any code in test constructors (xUnit has no explicit test-initialize method; the constructor is the substitute, and if you lean on that you may as well just use NUnit).  That choice seems to make a lot of sense given how big some of the test classes got.
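For anyone unfamiliar with the xUnit idiom being referred to: there is no [SetUp]/[TestInitialize] attribute, so per-test setup normally goes in the constructor, as in this invented example (the EF tests largely avoid even that):

```csharp
using System.Collections.Generic;
using Xunit;

// Minimal stand-in so the example compiles.
public class FakeCustomerContext
{
    public List<string> Customers = new List<string>();
}

public class CustomerQueryTests
{
    private readonly FakeCustomerContext _context;

    // xUnit creates a fresh instance of the test class for every test,
    // so the constructor plays the role of a per-test initializer.
    public CustomerQueryTests()
    {
        _context = new FakeCustomerContext();
    }

    [Fact]
    public void Customer_list_starts_out_empty()
    {
        Assert.Empty(_context.Customers);
    }
}
```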

The 2,000+ suppressed rules feel like overkill (a good number of them exist just to defend the spelling of identifiers).  It feels counterproductive to say "We are going to adhere to this ruleset" and then make 2,345 exceptions.

Taking suppression fire
A good number of these suppressions have no justification parameter filled in.  Finding a catch-all exception handler that isn't explained by a comment or a justification is discouraging.  You will see the most rigorous FxCop rulesets applied in many Microsoft open source projects.

I hope this gets people excited about a great project and gives you a very brief introduction to it.  I may do a part two that goes a bit more in depth if there is interest.