Thursday, May 21, 2009

Stringy stringy bang bang

I had a simple scenario: Array of strings bound to a GridView. Initially, with auto generation of columns, this is easy. You get a single column with all the strings in the array. But then I needed to do something practical with it, so thus needed to specify the columns:

<Columns>
<asp:BoundField DataField="?????" />
...
</Columns>

I noticed that the column header said "Item" when auto generated so I tried that, but it didn't work. So then, I ask myself, what is the data field of a System.String? The only bindable member (public field or property) is .Length. Good to know but useless in this case. The ToString() member is a method so it can't be bound to. But apparently there is an undocumented trick: Use the exclamation point (!) as the datafield value and you'll get the string's value.

<asp:BoundField DataField="!" />

This yields the desired result.

Thursday, April 23, 2009

Understanding MS SQL Server Named Instance Connections

I had had several discussions in the past with a systems engineer about why the MS SQL Server named instances he had set up in various environments were using non-standard ports. I explained that the whole idea of named instances was to eliminate that need. He couldn't explain why he needed non-standard ports, just that he did. Of course, that didn't satisfy my curiosity but I dismissed it until now.

I was recently setting up several new instances of SQL Server on a machine in our data center that already has a default instance on it. As such, I created all the new ones as named instances. All was well and good. I connected to the new instances from the local machine (presumably using the default shared memory connection) and continued configuring the instances including enabling the TCP/IP protocol. I did all of this work via remote desktop on the machine itself.

Then it came time to connect to these new instances from my local environment through the corporate network firewall(s) using SQL management studio. The existing instance connected fine. But none of the new named instances would connect. I was puzzled and irritated. Obviously, the standard SQL port (1433) was open through the firewall
to that box, as I could connect to the default instance. So what was going on?

After a bit of reading, digging further into the instance configurations and a conversation with a co-worker, I learned about dynamic ports. Here's how it works:

When you connect to a server using a default instance configuration (e.g. "myserver"), the SQL client calls the server on the standard port: 1433. In my case, this was fine, the port was open through the firewall. When you call a server using a named instance but lacking a port (e.g. "myserver\instancename"), the SQL client calls SQL Server's management service on port 1434. The management service replies with a dynamically assigned port (chosen at sql engine service startup) for the instance and the client connection proceeds using that new port. In my case, the network firewall doesn't have either the management service port nor the dynamic port(s) open. So my connections failed.

Using dynamic ports for SQL instances basically means that the instance names are like a SQL DNS system. They are simply there to help the server resolve a dynamic port. Once the port is resolved, the name is superfluous.

This brought me to the actual reason that the systems engineer had to use additional, and thus non-standard, ports for the additional SQL instances. He needed to have known ports so the firewall could be configured to allow them.

On further investigation I learned that once you have a predefined port assigned to a named instance you no longer need the actual instance name for the connection. You can simply connect using the server name (or IP) and port number, for example: "myserver,1432". Also, once you are connecting using predefined port(s) you no longer
need to run the SQL browser service on the server that provides the instance name lookup, thus freeing up a few system resources.

So the simple solution to my original problem was to assigned specific ports to the various instances on the machine. It turned out that the first non-standard port I used, 1432, was already open through the firewall so I could connect to the existing default instance and one of the new named instances remotely without having to request a firewall change. The other instances aren't as critical to have outside connectivity to so I can worry about those later.

Saturday, January 31, 2009

Adventures in LINQ: Deferred execution

I recently started a new project in ASP.NET 3.5. It actually a total re-write of an old application I created some time ago. I haven't done any significant web development since the official release of 3.5 (hard to believe) so I needed to get back on the horse and get up to speed. One of the goals is to get familiar with LINQ. One of my co-workers used it for a project we architected and he said that LINQ saved boat loads of effort so I figured it was high time for me to get my act together.

One of the first things I've discovered with LINQ is how chatty it is with the database server. After starting to experiment with LINQ I turned on the SQL Server Profiler so I could watch what LINQ was doing with the database. I noticed far more queries being executed than I'd have expected. Way more than I would execute myself if I were programming the traditional way with my own explicitly defined queries. A key source of the excessive query traffic is the concept of deferred execution.

Here's how it plays out.  I have my digital photos in a file repository on disk.  I'm building a database driven tagging system for them.  In my photo library database, I have a table called "Photo".  This abstracts out to the "Photos" class in the LINQ to SQL classes DBML.  I need to scan the file repository for new photos and check the database to see if they already exist.  Simple enough.  For each photo file I encounter I want to check the database.  It stands to reason that I shouldn't query the DB for each individual file, but rather preload the photo table to an in memory collection then look through the collection for each photo that's encountered.  So I approach it this way:

01 string[] strFiles;
02 strFiles = Directory.GetFiles(dirPath, "*.*");
03
04 var lstPhotos = from p in _db.Photos select p;
05 for(int i = 0; i < strFiles.Length; i++)
06 {
07 intPhotoCount = (from p in lstPhotos
08 where p.RelativePath == strFiles[i]
09 select p).Count();
10 if(intPhotoCount == 0)
11 {
12 // do create photo stuff
13 }
14 }

What I expected to see was a single query fired against the database at line 04.  But I didn't.  Instead, I got one query for each iteration of the loop, fired at line 07-09.  This query, as suggested by the LINQ, contained an explicit 'where' clause for the RelativePath value test.  

exec sp_executesql N'SELECT COUNT(*) AS [value]
FROM [dbo].[Photo] AS [t0]
WHERE [t0].[RelativePath] = @p0',
N'@p0 varchar(30)',@p0='.JPG'

With many thousands of photos, this isn't good.  So I thought that maybe if I forced the data context into delivering a full set the collection would preload as I planned.  I added this line immediately after the initial LINQ (after line 04):

lstPhotos.GetEnumerator();

When I ran this, I saw this SQL execute:

SELECT [t0].[PhotoId], [t0].[CollectionId]
, [t0].[RelativePath], [t0].[Title]
, [t0].[Description], [t0].[Height], [t0].[Width]
FROM [dbo].[Photo] AS [t0]

Great! However, it's immediate followed up by the same loop specific queries. All I did was add a superfluous query to the noise. The GetEnumerator() call got me thinking that I could use the actual resulting enumerator. However, my original approach was: for each file encountered, check the existing photo list from the database to see if I've already captured it. In order to use the enumerator, I have to reverse the tactic because we can't find a single item using an enumerator, only go through the enumerator's set front to back.

So let's reverse the approach. Instead of iterating through the file list, we'll go through the LINQ result set. Then I'll remove the file item that matches. The result will be the files that don't yet exist in the database. We can then add those.

01 List lstFiles = new List(strFiles);
02 var lstPhotos = from p in _db.Photos select p;
03 foreach(Photo photo in lstPhotos)
04 {
05 lstFiles.Remove(photo.RelativePath);
06 }
07 if(lstFiles.Count > 0)
08 {
09 for(int i = 0; i < lstFiles.Count; i++)
10 {
11 // do new photo stuff
12 }
13 }

First I create a generic collection out of the original file name string array (return type of Directory.GetFiles()). After the LINQ for the complete result set from Photos, I do a foreach, which calls .GetEnumerator() on lstPhotos. This results in a single SQL query for the entire result set. Then I iterate through the results, removing any matches. Last, I do the actual creation of the remaining items.

I find more and more with .NET that I must work as though I'm carving an ice sculpture. Instead of trying to build something up with little cubes it's often easier to start with a single large block and take away what you don't need. This is a surprisingly applicable analogy. While you need a lot more space (memory) to start with the large block of ice (data), once you've chipped away what you don't need, it's easily cleaned up (garbage collection). The alternative is a process that takes considerably longer and may ultimately be more fragile in the end.

It's amazing how often thinking backwards can lead to a better solution.

Monday, November 03, 2008

Google vote map wins 3 to 1, but I'll had to drive 3 hours

Apparently "Pulaski" is a common Ukrainian name, and they like to put American Citizens Clubs on streets with that name.

I saw Scott Hanselmans's tweet about Google's 2008 US Voter Info map.

For the heck of it, I tried my current address and 3 places I used to live, one of which being the town of Cohoes, outside of Albany NY (the capital city, in what is considered to be upstate NY).

Google suggested that I drive 170 mi to the Ukrainian American Citizens Club on Pulaski Avenue in Staten Island. Interestingly, the place I *did* vote 4 years ago was the Ukrainian American Citizens Club on Pulaski Street in Cohoes, NY. They had the right place, (almost) the right street name, just the wrong city.

It's a good thing I moved, or I'd have to forfeit my vote as the absentee ballot probably wouldn't get there in time.

Wednesday, October 15, 2008

Tron guy is a PC

Who says Microsoft doesn't have a sense of humor? Not only are they copying the image of John Hodgman's buttoned up "PC" from the Mac ads, a recent marketing email (obviously a long delayed rebuttal to those ads which accompanies the TV ad series) includes a thumbnail of none other than the Tron guy.

Thursday, October 02, 2008

Sometimes Google scares me

My parents just got a new dog.  They sent me a short note about it when it came home for the first time.  In the email the following words are mentioned that relate to dogs within the context of the message, those that could be understood to have to do with dogs outside the message context are in bold:
  • spayed
  • standard poodle
  • lab
  • brush
  • paw (lots of animals have paws, and it's a verb)
My gmail account showed some ads as it always does, they can be seen to the right.  This was one of the rare occasions that I actually looked at them.

Among the ads is one for "Australian Labradoodles".   My parent's new dog just happens to be a lab poodle mix.  However, nowhere in the message is that mentioned apart from the simultaneous occurrence of "lab" with "poodle" in the message.  Obviously though, any astute advertiser would pair those words for their labradoodle business.  Interestingly though, none of the ads seem to directly deal with having animals fixed.

It just amazes me how accurate the google ads can be given so little to go on.  I recently finished reading "The Google Story" which explains a bit about the general ideas behind google's methodologies of search ranking, but doesn't say too much about it's ad matching techniques apart from the word bidding concepts (understandably).  (Very good book, by the way, I highly recommend it.)

I wonder how similar the results would be if I wrote an email saying:
After I brush my teeth I'm going to paw through a catalog then take the poodle to be spayed.

Wednesday, September 10, 2008

Identifying servers in a web farm with IIS headers

Once your application has been deployed to a web server farm, it can become tricky to track down problems.  Particularly when a problem occurs intermittently.  Sometimes these intermittent problems are such because they are occurring on only one machine of a web farm.  

Identifying the problem server can be rather challenging.  Often the first attempt is to modify your local DNS (in Windows it's the HOSTS file) to point the site URL to a single machine.  Depending on how your web farm is set up you may not be able to do this because the individual machines may not be visible to you.  Only the farm's pool address is visible.  Furthermore, sometimes the problems we encounter do not manifest themselves when running on a single environment (otherwise we'd have caught them in development right??).  To complicate the matters moreso, often the only chance you have to identify on which machine the problem occurred is right when it occurred, as in, when you are staring at the application crash page.  Simply attempting to replicate the problem after you set up your tracking may not be enough.

A simple solution I have implemented on our staging and production web farms involves nothing more than the built in HTTP headers supplied by IIS.  First, just add an HTTP header to each machine in the farm that contains the name of the machine, or any other unique value that you can map to the machine:


Then, when you browse to the site or are looking at an error message, you can open a tool like Fiddler or FireBug to view the page's HTTP header information for the response.




Particularly with a tool such as FireBug or another DOM inspector, you can get immediate information without having to start any kind of tracking tool or needing to relaunch the site.