Wednesday, June 05, 2013

Projecting to DB entity types in LINQ to SQL

While updating an ASP.NET MVC view for a rather simple list of DB based content items coming from a LINQ-to-SQL query, I realized a small optimization was in order.  I wanted to trim it down to only select the necessary columns used by the particular MVC action in which the query was used.  The query was returning the native DB context type based on the table schema and I didn't want to change that (possibly more for the exercise of it than the outcome; although, avoiding class explosion was a legitimate reason).  Instead, I just wanted fewer of the properties populated.

Here's the original code:

  var contentItems = BmsData.Contents.ToList();

and the resulting SQL command:

  SELECT [t0].[ContentNavID], [t0].[ParentTocID], 
    [t0].[PrevContentNavID], [t0].[NextContentNavID], 
    [t0].[Title], [t0].[ViewerContent], [t0].[CreateDate]
  FROM [content].[Content] AS [t0]

Pretty simple, just the whole table as a list of 'Content'.  To trim this down, I tried this:

  var contentItems = BmsData.Contents
    .Select(x => new Content() {
      Title = x.Title, ContentNavID = x.ContentNavID
    }).ToList();

This compiled.  However, when I ran it I got this exception:

  Explicit construction of entity type 'CL.eCourse.BMS.Data.DataLayer.Content' in query is not allowed.

Apparently, it's a no-no to new-up a type defined by the generated data context.  The reasoning has to do with "polluting the context cache" as well as things like object tracking and identify field updates.  While the justification makes sense, it would still be rather helpful to be able to get a subset of the fields based on what I know I'm going to need.

Some searching led to a solution involving a function delegate:

  Func makeContent = 
    x => new Content() { 
      ContentNavID = x.ContentNavID, Title = x.Title
    };

  var contentItems = BmsData.Contents.Select(x => makeContent(x)).ToList();

Again, this compiles and runs, but it doesn't do what we want.  Since the expression delegates to a method, the expression can't be parsed to translate it into the query components, finding only the desired fields in the selector.  It seems that the compiler just runs the query for the full select of the entity, then calls the delegate for each one.  Indeed, a look at the SQL executed by the data context proves this:

  SELECT [t0].[ContentNavID], [t0].[ParentTocID], 
    [t0].[PrevContentNavID], [t0].[NextContentNavID], 
    [t0].[Title], [t0].[ViewerContent], [t0].[CreateDate]
  FROM [content].[Content] AS [t0]

Here we see the same SQL as above, with all the columns for the 'Content' data entity, when we only wanted to see 'Title' and 'ContentNavID'.  Since we can't new-up a DB entity type, and we can't delegate the expression the way I tried (there's probably a more complicated way to using the Expression<> type) the last option was to use an anonymous type, then re-select as the entity type.  So we first select the properties we care about using an anonymous type, then explicitly force it into an Enumerable (which will execute the query projection to the anonymous type), then re-select it into the DB entity type we originally had.

  var contentItems = BmsData.Contents
    .Select(x => new { x.Title, x.ContentNavID })
    .AsEnumerable()
    .Select(x => new Content() {
      Title = x.Title, ContentNavID = x.ContentNav
    });

This yields the following changed SQL query:

  SELECT [t0].[Title], [t0].[ContentNavID]
  FROM [content].[Content] AS [t0]

And we have the optimization we were looking for!

Thanks to Tommy Etkin for his help with this.