A Brief Entity Framework Codebase Overview

Open source enterprise projects are a funny thing. Enterprise software leaks seem to get read more often than enterprise open source software; once you willingly release your software, people seem to stop being infatuated with it unless they plan to modify the code.

Few people seem to actually read a portion of it just for the sake of reading it, especially on massive projects.

I couldn't find a single article that gave an overview of what's going on inside the Entity Framework project, so I'm hoping to give some objective insight into what I found and point out some of the more interesting things along the way.

Solution Overview

(As of commit a88645f8581c)

Visual Studio 2012
Language : C# with some tests in VB.NET
Lines of Code : 188,547 (145,126 not counting tests)
Projects : 10

Test Projects : 4 (2 Functional, 1 Unit, 1 VB.NET)
Test Framework : xUnit (xUnit has no test-initialize attribute, and no test class constructors are used)

Solution Build Time : 29.26 seconds

Test count, run time

Unit : 4713 tests ran in 233.59 seconds
Functional : 3541 tests ran in 822.97 seconds
Functional.Transitional : 1865 tests ran in 344.25 seconds
VB.NET : 47 tests ran in 6.28 seconds

(All run on a quad-core i7 3.4 GHz with TestDriven.Net)

FxCop Rule Standards : Microsoft Managed Recommended Rules and a custom ruleset for the EntityFramework assembly.
FxCop Rule Suppressions : 2,345

Some Code Overview

How are the nuances of every different SQL Server version handled? This was one of my biggest curiosities when I opened this project. Take a second to postulate.

The answer? Lots of inline version checks and great test coverage. Here's some code that has to wrangle multiple versions of SQL Server.

Fun fact : SQL Server 2008 is still referred to by its codename "Katmai" in comments and method names all over the codebase. The same goes for SQL Server 2005, "Yukon".
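To give a feel for the pattern, here is a minimal sketch of an inline version check. It is not the project's actual code: the `SqlVersion` enum, the `SqlVersionChecks` class and both method names are hypothetical and only illustrate the idea of branching on the server version. The one fact it relies on is that the datetime2 type first shipped in SQL Server 2008.

```csharp
using System;

// Hypothetical names throughout; this is an illustration, not Entity Framework's code.
public enum SqlVersion
{
    Sql9 = 90,   // SQL Server 2005 ("Yukon")
    Sql10 = 100, // SQL Server 2008 ("Katmai")
    Sql11 = 110  // SQL Server 2012
}

public static class SqlVersionChecks
{
    // True for any server older than SQL Server 2008.
    public static bool IsPreKatmai(SqlVersion version)
    {
        return version < SqlVersion.Sql10;
    }

    // datetime2 was introduced in SQL Server 2008, so older servers
    // have to fall back to the plain datetime type.
    public static string DateTimeStoreType(SqlVersion version)
    {
        return IsPreKatmai(version) ? "datetime" : "datetime2";
    }
}
```

With these definitions, `SqlVersionChecks.DateTimeStoreType(SqlVersion.Sql9)` returns "datetime", while the same call with `SqlVersion.Sql10` or `SqlVersion.Sql11` returns "datetime2".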

Here are the accompanying tests for the above code that cover all SQL Server versions. (Interesting that they chose underscores to separate words in test names over camel casing.)

Part of the coding standard is to use 'var' in local variable declarations wherever possible. That's not uncommon, and it's a part of my standards as well.

I noticed some copied and pasted classes even when both copied classes are in projects that have a common dependency on the EntityFramework assembly. This is likely no problem for a team with rigorous reviewing standards but it could be a bit nicer. Don't take this as "The entire codebase is copied and pasted"; it's really just something that piqued my interest.

The new Analyze Solution for Code Clones functionality in Visual Studio 2012 rocks, by the way. It's still not as good or as fast as Atomiq (pictured above) but it's worth trying out.

Conclusions

The code is extremely well written. The test coverage is fantastic, and the coding standard is so consistent that the entire codebase looks like it was written by a single super-competent programmer. It's interesting that they went with xUnit and didn't put any code in test constructors (xUnit has no test-initialize attribute, so if you're doing setup in constructors anyway, you may as well just use NUnit). That choice seems to make a lot of sense given how big some of the test classes got.

The 2,000+ suppressed rules feel like overkill (a good number are targeted at defending variable spelling). It feels counterproductive to say "We are going to adhere to this ruleset" and then make 2,345 exceptions.

Taking suppression fire

A good number of these suppressions were made without the justification parameter filled in. Finding an instance where a catch-all exception handler wasn't explained in a comment or a justification is discouraging. You will see the most rigorous FxCop ruleset applied in many Microsoft open source projects.

I hope this gets people excited about a great project and supplies you with a very brief introduction. I may do a part two that actually goes a bit more in depth if there is interest.
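For comparison, a suppression with the justification filled in might look like the sketch below. The `CleanupHelper` class and its `TryRun` method are hypothetical and not taken from the Entity Framework codebase; `SuppressMessageAttribute` and rule CA1031 (do not catch general exception types) are the real framework attribute and FxCop rule.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

// Hypothetical helper, used only to show a documented suppression.
public static class CleanupHelper
{
    // The Justification explains why the catch-all handler is acceptable here.
    [SuppressMessage("Microsoft.Design", "CA1031:DoNotCatchGeneralExceptionTypes",
        Justification = "Cleanup is best effort; a failure here must not mask the original error.")]
    public static bool TryRun(Action cleanup)
    {
        try
        {
            cleanup();
            return true;
        }
        catch (Exception)
        {
            // Swallowed on purpose; the caller only learns that cleanup failed.
            return false;
        }
    }
}
```

A reviewer who lands on the catch block can read the reason right next to the suppression instead of having to guess at it.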

New Modifier - .NET/C# information and Tutorials

Friday, March 1, 2013