As a follow-up to my last post about accessing synonyms via a linked server (you can’t), I stated that it was therefore better to use a view. However, that has issues of its own to consider.
Nothing’s ever easy, you can’t access synonyms via a linked server and therefore you want to use views, but they are just asking for trouble as well because SQL Server can’t gauge statistics on them.
We’ve always had issues in SQL Server with linked servers and statistics because SQL Server historically had no access to the stats of objects on the linked server unless you were using a high privilege account such as db_owner or sysadmin. However, this changed in SQL 2012 SP1 in which they opened up the stats so that all you needed was db_datareader and you could see the stats of the remote table you were accessing.
However, what happens if you’re accessing a view?
Let’s create a simple table and populate it on my TRACYSQL lab cluster:
set nocount on
go
create database testDB
go
use testDB
go
create table testRealTable
(
id int identity
)
go
insert into testRealTable default values
go 100
Now let’s create a view on my ROBINSQL lab cluster to access this:
use testDB
go
create view testRealView
as
select *
from TracySQL.testDB.dbo.testRealTable
go
That was simple enough.
Now, my linked server accounts are both sysadmin and therefore we have more than we need to see stats within my SQL 2014 environment… therefore let’s see what execution plan we get when trying to read some data:
select * from RobinSQL.testDB.dbo.testRealView
All as expected… so what about the estimated number of rows?
It’s estimating 10,000? We only have 100 rows? Well that’s actually the default for a remote query in SQL Server… so why is it using that when we have access to see the stats?
Well the problem is that we have access to the stats of objects, and therefore SQL Server asks for the stats on the object we’re querying. But we’re querying a view and the view itself has no stats and therefore it returns a null which SQL then translates into its defaults.
The linked server doesn’t break down the view and look at the underlying tables and then return those stats and therefore if you’re thinking of accessing a view via a linked server then be VERY careful and test your queries carefully because I’ve seen tables in excess of 100 million rows and 300GB of data being seen as “10,000 rows” and therefore SQL Server thinks it’s cheaper to execute a cross server query locally and so tries to pull VERY large tables across the network (thinking they’re small) which is an absolute performance killer.
You have been warned.