Here are ten root causes of the most common misunderstandings, distilled from many hundreds of questions on the LINQ forums.
Myth #1
All LINQ queries must start with the ‘var’ keyword. In fact,
the very purpose of the ‘var’ keyword is to start a LINQ query!
The var keyword and LINQ queries are separate concepts. The purpose of
var is to let the compiler guess what type you want for a local variable
declaration (implicit typing).
For example, the following:
var s = "Hello";
is precisely equivalent to:
string s = "Hello";
because the compiler infers that s is a string.
Similarly, the following query:
string[] people = new [] { "Tom", "Dick", "Harry" };
var filteredPeople = people.Where (p => p.Length > 3);
is precisely equivalent to:
string[] people = new [] { "Tom", "Dick", "Harry" };
IEnumerable<string> filteredPeople = people.Where (p => p.Length > 3);
You can see here that all that we're achieving with var is to abbreviate IEnumerable<string>. Some people like this because it cuts clutter; others argue that implicit typing can make it less clear what's going on.
Now, there are times when a LINQ query necessitates the use of var. This is when projecting an anonymous type:
string[] people = new [] { "Tom", "Dick", "Harry" };
var filteredPeople = people.Select (p => new { Name = p, p.Length });
Here is an example of using an anonymous type outside the context of a LINQ query:
var person = new { Name="Foo", Length=3 };
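To see that var is purely a compile-time convenience, here's a minimal sketch built on the same people array. The class name VarDemo and the method LongNames are made up for illustration; nothing here is specific to any LINQ provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class VarDemo   // hypothetical name, for illustration only
{
    // Returns the names longer than three characters.
    public static List<string> LongNames()
    {
        string[] people = { "Tom", "Dick", "Harry" };

        IEnumerable<string> explicitlyTyped = people.Where(p => p.Length > 3);
        var implicitlyTyped = people.Where(p => p.Length > 3);

        // Both variables have the same static type, so this assignment compiles.
        explicitlyTyped = implicitlyTyped;

        // var is not dynamic typing: uncommenting the next line
        // produces a compile-time error.
        // implicitlyTyped = 123;

        return explicitlyTyped.ToList();
    }

    static void Main()
    {
        foreach (string name in LongNames())
            Console.WriteLine(name);   // Dick, then Harry
    }
}
```

The commented-out assignment is the point: the type of a var-declared variable is fixed when the code is compiled, exactly as if we had written it out by hand.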
Myth #2
All LINQ queries must use query syntax.
There are two kinds of syntax for queries: lambda syntax and query syntax (or query comprehension syntax). Here's an example of lambda syntax:
string[] people = new [] { "Tom", "Dick", "Harry" };
var filteredPeople = people.Where (p => p.Length > 3);
Here's the same thing expressed in query syntax:
string[] people = new [] { "Tom", "Dick", "Harry" };
var filteredPeople = from p in people
                     where p.Length > 3
                     select p;
Logically, the compiler translates query syntax into lambda syntax. This means that everything that can be expressed in query syntax can also be expressed in lambda syntax. Query syntax can be a lot simpler, though, with queries that involve more than one range variable. (In this example, we used just a single range variable, p, so the two syntaxes were similarly simple.)
Not all operators are supported in query syntax, so the two syntax styles are complementary. For the best of both worlds, you can mix query styles in a single statement (see Myth #5 for an example).
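To make the point about multiple range variables concrete, here's a small sketch over the same people array. The class and method names are hypothetical, and the lambda version only approximates what the compiler generates; the exact translation is the compiler's business.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SyntaxDemo   // hypothetical name, for illustration only
{
    static readonly string[] People = { "Tom", "Dick", "Harry" };

    // Query syntax: two range variables, a and b.
    public static List<string> QuerySyntax() =>
        (from a in People
         from b in People
         where a.Length < b.Length
         select a + " < " + b).ToList();

    // Lambda syntax: both range variables have to be carried along
    // in a temporary anonymous type, which is noticeably clumsier.
    public static List<string> LambdaSyntax() =>
        People.SelectMany(a => People, (a, b) => new { a, b })
              .Where(x => x.a.Length < x.b.Length)
              .Select(x => x.a + " < " + x.b)
              .ToList();

    // Mixed syntax: a query expression in parentheses, followed by
    // Count(), an operator that has no query-syntax keyword in C#.
    public static int MixedSyntax() =>
        (from p in People where p.Length > 3 select p).Count();

    static void Main()
    {
        foreach (string pair in QuerySyntax())
            Console.WriteLine(pair);   // Tom < Dick, Tom < Harry, Dick < Harry
        Console.WriteLine(QuerySyntax().SequenceEqual(LambdaSyntax()));   // True
        Console.WriteLine(MixedSyntax());                                 // 2
    }
}
```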
Myth #3
To retrieve all customers from the customer table, you must perform a query
similar to the following:
var query = from c in db.Customers select c;
The expression:
from c in db.Customers select c
is a frivolous query! You can simply write:
db.Customers
Similarly, the following LINQ to XML query:
var xe = from e in myXDocument.Descendants ("phone") select e;
can be simplified to:
var xe = myXDocument.Descendants ("phone");
And this:
Customer customer = (from c in db.Customers where c.ID == 123 select c).Single();
can be simplified to:
Customer customer = db.Customers.Single (c => c.ID == 123);
Myth #4
To reproduce a SQL query in LINQ, you must make the LINQ query look as
similar as possible to the SQL query.
LINQ and SQL are different languages that employ very
different concepts.
Possibly the biggest barrier in becoming productive with LINQ is the "thinking in SQL" syndrome: mentally formulating your queries in SQL and then transliterating them into LINQ. The result is that you're constantly fighting the API!
Once you start thinking directly in LINQ, your queries will often bear little resemblance to their SQL counterparts. In many cases, they'll be radically simpler, too.
Myth #5
To do joins efficiently in LINQ, you must use the join keyword.
This is true, but only when querying local collections. When querying a database, the join keyword is completely unnecessary: all ad-hoc joins can be accomplished using multiple from clauses and subqueries. Multiple from clauses and subqueries are more versatile, too: you can also perform non-equi-joins.
Better still, in LINQ to SQL and Entity Framework, you can query association properties, alleviating the need to join altogether! For instance, here's how to retrieve the names and IDs of all customers who have made no purchases:
from c in db.Customers
where !c.Purchases.Any()
select new { c.ID, c.Name }
Or, to retrieve customers who have made no purchases over $1000:
from c in db.Customers
where !c.Purchases.Any (p => p.Price > 1000)
select new { c.ID, c.Name }
Notice that we're mixing lambda (fluent) syntax and query syntax. See LINQPad for more examples on association properties, manual joins, and mixed-syntax queries.
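For comparison, here's a rough sketch of a manual join written both ways, plus a non-equi-join that the join keyword cannot express. It runs against in-memory arrays with made-up data (the JoinDemo class, its fields, and the sample rows are all assumptions for illustration), so it shows only that the results agree, not what SQL a database provider would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class JoinDemo   // hypothetical name and sample data
{
    static readonly (int ID, string Name)[] Customers =
        { (1, "Tom"), (2, "Dick"), (3, "Harry") };

    static readonly (int CustomerID, int Price)[] Purchases =
        { (1, 50), (1, 1500), (2, 20) };

    // Equi-join with the join keyword.
    public static List<string> WithJoinKeyword() =>
        (from c in Customers
         join p in Purchases on c.ID equals p.CustomerID
         select c.Name + " " + p.Price).ToList();

    // The same join with multiple from clauses and a where clause.
    public static List<string> WithMultipleFrom() =>
        (from c in Customers
         from p in Purchases
         where p.CustomerID == c.ID
         select c.Name + " " + p.Price).ToList();

    // A non-equi-join: each customer paired with purchases made by
    // *other* customers. The join keyword supports only 'equals'.
    public static List<string> NonEquiJoin() =>
        (from c in Customers
         from p in Purchases
         where p.CustomerID != c.ID
         select c.Name + " " + p.Price).ToList();

    static void Main()
    {
        Console.WriteLine(string.Join(", ", WithJoinKeyword()));    // Tom 50, Tom 1500, Dick 20
        Console.WriteLine(string.Join(", ", WithMultipleFrom()));   // Tom 50, Tom 1500, Dick 20
        Console.WriteLine(NonEquiJoin().Count);                     // 6
    }
}
```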
Myth #6
Because SQL emits flat result sets, LINQ queries
must be structured to emit flat result sets, too.
This is a consequence of Myth #4. One of LINQ's big
benefits is that you can:
- Query a structured object graph through association properties (rather than having to manually join)
- Project directly into object hierarchies
from c in db.Customers
where c.State == "WA"
select new
{
    c.Name,
    c.Purchases    // An EntitySet (collection)
}
The hierarchical result from this query is much easier to work with than a flat result set!
We can achieve the same result without association properties as follows:
from c in db.Customers
where c.State == "WA"
select new
{
    c.Name,
    Purchases = db.Purchases.Where (p => p.CustomerID == c.ID)
}
Myth #7
To do outer joins in LINQ to SQL, you must always use DefaultIfEmpty().
This is true only if you want a flat result set.
The examples in the preceding myth, for instance, translate to a left outer join in
SQL, and require no DefaultIfEmpty operator.
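The difference between the two result shapes can be sketched with in-memory data. The OuterJoinDemo class and its sample rows are made up for illustration, and because this is LINQ to Objects it says nothing about the SQL a provider would emit; it only contrasts a flat outer join (which needs DefaultIfEmpty) with a hierarchical projection (which doesn't).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OuterJoinDemo   // hypothetical name and sample data
{
    static readonly (int ID, string Name)[] Customers =
        { (1, "Tom"), (2, "Dick"), (3, "Harry") };

    static readonly (int CustomerID, int Price)[] Purchases =
        { (1, 50), (1, 1500), (2, 20) };

    // Flat result: one row per customer/purchase pair. DefaultIfEmpty
    // supplies a null price so customers without purchases still appear.
    public static List<string> Flat() =>
        (from c in Customers
         from price in Purchases.Where(p => p.CustomerID == c.ID)
                                .Select(p => (int?)p.Price)
                                .DefaultIfEmpty()
         select c.Name + ": " + (price.HasValue ? price.Value.ToString() : "none"))
        .ToList();

    // Hierarchical result: one row per customer, each with its own
    // (possibly empty) collection of prices. No DefaultIfEmpty needed.
    public static List<string> Hierarchical() =>
        (from c in Customers
         select new
         {
             c.Name,
             Prices = Purchases.Where(p => p.CustomerID == c.ID)
                               .Select(p => p.Price)
                               .ToList()
         })
        .Select(x => x.Name + ": " + x.Prices.Count + " purchase(s)")
        .ToList();

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Flat()));
        // Tom: 50, Tom: 1500, Dick: 20, Harry: none
        Console.WriteLine(string.Join(", ", Hierarchical()));
        // Tom: 2 purchase(s), Dick: 1 purchase(s), Harry: 0 purchase(s)
    }
}
```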
Myth #8
A LINQ to SQL or EF query will be executed in one round-trip only if the query
was built in a single step.
LINQ follows a lazy evaluation model, which means queries
execute not when constructed, but when enumerated. This means
you can build up a query in as many steps as you like, and it won't actually
hit the server until you eventually start consuming the results.
For instance, the following query retrieves the names of all customers whose name starts with the letter 'A', and who have made at least two purchases. We build this query in three steps:
var query = db.Customers.Where (c => c.Name.StartsWith ("A"));
query = query.Where (c => c.Purchases.Count() >= 2);
var result = query.Select (c => c.Name);

foreach (string name in result)   // Only now is the query executed!
    Console.WriteLine (name);
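You can watch the lazy evaluation model at work without a database. The following LINQ to Objects sketch (the DeferredDemo class, its counter, and the sample names are assumptions for illustration) builds a query in three steps and counts how often the first filter actually runs. A database provider translates the query to SQL rather than invoking delegates, so treat this as an illustration of the timing only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo   // hypothetical name, for illustration only
{
    // Counts how many times the first filter has been evaluated.
    public static int Evaluations;

    // Builds the query in three steps; nothing is executed here.
    public static IEnumerable<string> BuildQuery()
    {
        string[] names = { "Alice", "Bob", "Anna" };

        IEnumerable<string> query = names.Where(n =>
        {
            Evaluations++;
            return n.StartsWith("A", StringComparison.Ordinal);
        });
        query = query.Where(n => n.Length > 4);
        return query.Select(n => n.ToUpperInvariant());
    }

    static void Main()
    {
        IEnumerable<string> query = BuildQuery();
        Console.WriteLine(Evaluations);   // 0: the query has not run yet

        List<string> results = query.ToList();   // enumeration runs the query
        Console.WriteLine(Evaluations);   // 3: one evaluation per source element
        Console.WriteLine(string.Join(", ", results));   // ALICE
    }
}
```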
Myth #9
A method cannot return a query if the query ends in the 'new' operator.
The trick is to project into an ordinary named type with an object initializer:
public IQueryable<NameDetails> GetCustomerNamesInState (string state)
{
    return from c in db.Customers
           where c.State == state
           select new NameDetails
           {
               FirstName = c.FirstName,
               LastName = c.LastName
           };
}
NameDetails is a class that you'd define as follows:
public class NameDetails
{
    public string FirstName, LastName;
}
Myth #10
The best way to use LINQ to SQL is to instantiate a
single DataContext to a static property, and use that shared instance
for the life of the application.
This strategy will result in stale data, because objects
tracked by a DataContext instance are not refreshed simply by requerying.
Using a single static DataContext instance in the middle tier of a distributed application will cause further trouble, because DataContext instances are not thread-safe.
The correct approach is to instantiate fresh DataContext objects as required, keeping DataContext instances fairly short-lived. The same applies with Entity Framework.