LiveLinq Query Performance: Tuning Indexing Performance

Query speedup achieved on any particular query by LiveLinq using indexes for optimization depends on how that query is written. This dependence is usually not dramatic. LiveLinq does a fair job of recognizing opportunities to use indexes for optimization regardless of the way the query is written. For example, it will still execute a query efficiently using indexes even if the condition consists of multiple terms connected with logical operators.

If you want to ensure that your indexes are used effectively by LiveLinq, consider the following guidelines:

Elementary predicates that can benefit from indexing

LiveLinq can use an index by a property P in an elementary predicate (a condition without boolean operations) that has one of the forms (patterns) (1) – (6) listed below. The simplest and perhaps the most common of such elementary predicates is a where condition with an equality involving a property, as in

from x in X where x.P == 1 select x

It will use the index by x.P provided that such index exists, which, as described in How to create indexes, can be ensured either by creating this index explicitly

X.Indexes.Add(x => x.P);

or by using the Indexed() hint:

from x in X where x.P.Indexed() == 1 select x

In fact, LiveLinq supports a more general indexing. It does not necessarily have to be a property of x, it can be any expression depending on x (and only on x). This can come in handy, for example, if we are querying an untyped ADO.NET DataTable, so, because it is untyped, it does not have properties that we can index and query by their names and we need to use a query like this:

from c in customersTable.AsIndexed() where c.Field<string>("CustomerID") == "ALFKI"select x

Then we can use an index by the following expression:

X.Indexes.Add(c => c.Field<string>("CustomerID"));

We will quote only the most common case of a simple property, x.P in the list below, but keep in mind that it can be replaced everywhere with any expression depending on x (and only on x).

Following is the list of patterns recognized by LiveLinq as allowing optimization with indexes:

(1) Equality:

x.P == Const (Const is a value constant in the given query operator, which means that it can be an expression depending on external parameters and variables, but it must have the same value for all elements that are tested by this where condition).

Example:

from o in Orders.AsIndexed() where o.OrderID.Indexed() == 10011

(2) Inequality:

x.P op Const, where op is one of the comparison operators: >, >=, <, <=

Example:

from o in Orders.AsIndexed() where o.OrderID.Indexed() > 10011

(3) Left and right parts of an equality or inequality are interchangeable (commutative).

Example: The following query will use indexing exactly as the one in (1):

from o in Orders.AsIndexed() where10011 == o.OrderID.Indexed()

(4) StartsWith:

x.P.StartsWith(Const), if x.P is of type string.

Example:

from o in Orders.AsIndexed() where o.CustomerID.StartsWith("A")

(5) Belongs to (in, is an element of):

ConstColl.Contains(x.P), if ConstColl implements IEnumerable<T> where T is the type of x.P.

Example:

from o in Orders.AsIndexed()
where (new int[]{"ALFKI", "ANATR", "ANTON"}).Contains(o.CustomerID)

(6) Year comparison:

x.P.Year op Const, where x.P is of type DateTime and op is any of the comparison operators ==, >, >=, <, <=

Example:

from o in Orders.AsIndexed()
where o.Date.Indexed().Year == 2008

Other elementary predicates don't use indexes. Examples where indexing will not be used:

from o in Orders.AsIndexed() where o.Freight > 10

(if there is no index defined for the Freight property)

from o in Orders.AsIndexed() where o.OrderID.Indexed() < o.Freight

(comparison must be with a constant, not with a variable value)

from o in Orders.AsIndexed() where o.OrderID.Indexed() != 10011

(only certain comparisons are used, ! (not equal) is not one of them)

Boolean operators

Conjunction (&&) and disjunction (||) are handled by LiveLinq optimizer, including their arbitrary combinations and including possible parentheses. Other boolean operators, such as negation (!) are not handled by the optimizer; they block its use of indexing.

Conjunction:

Conjunction does not prevent the use of indexes. For example,

from o in Orders.AsIndexed() where o.Freight > 10 && o.Lines.Count > 5

will use the index by Freight property (supposing such index exists) even though the second condition cannot use indexes. LiveLinq will simply check the other condition for each item that is found using the index from the first condition.

Conjunction of conditions using the same property

Moreover, conjunctions of conditions with the same property will be optimized to render the best execution plan. For example, if the Freight property is indexed,

from o in Orders.AsIndexed() where o.Freight > 10 && o.Freight < 20

will not go through all orders with Freight > 10 and check if it is less than 20 for each of them but will use the index to go directly to the orders between 10 and 20 and won't check any other orders.

Conjunction of conditions with different properties: where subindexes come into play

Conjunctions using different properties, for example

from o in Orders.AsIndexed() where o.CustomerID == "ALFKI" && o.Freight < 20

can utilize subindexes, see Subindex(T, TKey) Class. If the index by CustomerID has a subindex by Freight, LiveLinq will not check all orders with CustomerID equal to "ALFKI" (which can be quite numerous). It will go directly to the subindex corresponding to the value "ALFKI" (only one such subindex exists, and it can be accessed directly, without any search) and enumerate the items in that subindex whose Freight is less than 20. Both operations (finding a subindex and enumerating a range of items in it) are direct access operators, without search, so the query is executed in the fastest way possible, without spending any time on checking items that don't contribute to the result.

Disjunction:

(a) For an indexed property x.P,

x.P == Const1 || x.P == Const2

is handled by re-writing the condition as

(new ...[]{Const1, Const2}).Contains(x.P)

(b) If two different properties x.P and x.Q are indexed, then

x.P op Const1 || x.Q op Const2 (op is ==, <, >, <=, etc)

is handled by re-writing the query as a union of two separate queries with corresponding elementary conditions.

Join

If either of the two key selectors in a join, that is, either x.K or y.K in

from x in X

join y in Y on x.K equals y.K

is indexed, LiveLinq will use that index to perform the join.

Ideally, both x.K and y.K are indexed, and then LiveLinq will execute the join in the fastest way possible (using the so called merge join algorithm, which is very fast; basically, it takes the same time to build as to traverse the result of such join, there is no time lost on anything).

If there are no indexes on x.K or y.K, LiveLinq will still try to find the best algorithm for the join, which is sometimes merge join (if both parts of the join are ordered by the merge key) and sometimes the hash join algorithm used by the standard LINQ to Objects.

As a general guideline, it should be noted that defining indexes only for optimizing Join operations in your query will rarely lead to dramatic speedups. That's because the hash join algorithm (the one used in the standard LINQ, it does not need indexes) is already quite fast. It does make sense to define indexes for join keys sometimes, but consider it when you are fine-tuning your query, not when you need the first quick and dramatic speedup. For dramatic speedups, first consider optimizing your Where conditions, that can speed up your query often hundreds and even thousands times (depending on the query, of course).