Many to many relationships in LINQ-to-SQL

LINQ is a new component in .NET 3.5 that has a lot of .NET developers very excited. It provides a simple and flexible method of querying object-encapsulated data. The LINQ to SQL provider in particular has been getting a lot of attention since it can be used to query SQL Server databases using the same simple syntax that is used to query collections in memory - and furthermore can send updates made to the data back to the database. This feature, combined with very slick integration into Visual Studio 2008, provides a very strong motivation for developers to make the leap to .NET 3.5.

Unfortunatley, LINQ to SQL has a few limitations, the most obvious of which is the inability to directly handle the common representation of a many-to-many relationship. In this series, I look at the issue and some of the ways of circumventing it.

Let's look at this simple example from a simplified version of the Northwind database:

A simple schema representing a many-to-many relationship between Orders and Products

Since LINQ defines only one-to-one and one-to-many relationships, you end up with two classes in your object model representing the two participating domain objects (Order and Product), and a third representing the relationship between them. Often we'll want to write something like:

foreach (Product p in order.Products) { ...

- as we would with a one-to-many relationship - but, since there isn't a Products property of the Order object, we need to write:

foreach (Product p in order.Order_Details.Select(od => od.Product)) { ...

- which isn't as clear as we'd like it to be.

In this simple case, we are able to alter a partial implementation of the Order class (right click on it in the design view and select "View Code"). To this class declaration we can add the following code:

public IEnumerable Products
{
    get { return Order_Details.Select(od => od.Product); }
}

This is ideal for most purposes. We can now write foreach (Product p in order.Products) whenever we need to iterate through all the products in an order. We'll need to add code to the Products class as well if we need to query all the orders that include a certain product, but that's not too much hassle.

This solution is so simple that a lot of users are asking "Why doesn't LINQ do this already?". That's a good question. The use of a junction table containing nothing but two foreign keys (or sometimes two foreign keys plus a unique ID for the relationship) to represent a many-to-many relationship is practically ubiquitous. Why would a development tool as paradigm-shifting as LINQ omit this simple feature?

There's a simple answer: Not all many-to-many relationships have a junction table as simple as the one above. What if we need to track some property of the relationship other than the two participating entities?

A simple schema representing a many-to-many relationship between Orders and Products, including a property of the relationship - Quantity

As you can see, this example is a little less forgiving. We still want to be able to write foreach (Product p in order.Products), but what about the Quantity column of the Order_Detail table?

The problem is that the quantity of a product associated with an order is neither a property of the product, nor a property of the order. It's a property of the relationship between the product and the order. We could define a new type (possibly an anonymous type in C#) to contain all the fields of Product, plus all the relevant properties of the relationship (and another new type if we need to handle the relationship the other way around), but that doesn't really sound like a stellar plan. A Product is a Product, and having different types for dataContext.Products and order.Products isn't very productive.

In addition, having order.Products return an IEnumerator might let us perform our foreach, but it robs us of a number of other LINQ features - most notably the ability to handle inserts and deletes in the object model. What if we want to add a product to an order?

The relational model is very expressive, and it is important that our tools for interacting with it are no less expressive. By automating the simple case, we can see that we lose a great deal when the schema is not so simple.

That isn't to say that the simple solution I outlined above isn't useful. Being able to DataBind order.Products with a few lines of code is really handy - but you'll need to delve deeper to handle more complicated cases.

I'll be looking at some more generic solutions to this problem - including preserving access to the properties of the junction table, and handling inserts/updates/deletes - next time.

Next Article: Many to many relationships in LINQ-to-SQL (part 2) >

Your rating: None Average: 5 (1 vote)
Very nice article

Iain - thanks for sharing your ideas - very thoughtful insight!
U've pointed out some very good moments here

Submitted by Roger (not verified) on 10 July, 2008 - 11:16.

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd> <pre>
  • Lines and paragraphs break automatically.

More information about formatting options

Iain has finished the crunchy bits and is now perfecting the flavours and the wrapping.

Powered by Drupal, an open source content management system