Tuesday, September 29, 2009

Learning a New Language


Since I got my Mac, I've been trying to learn Objective-C, Cocoa, Cocoa Touch, Xcode, and the rest of the iPhone development stack.  As I spend more and more time on this, I've noticed some parallels between learning a new programming language and learning a new spoken language. 

A few years back I moved to Central America and, among other things, learned Spanish.  Learning Spanish was a long, slow, difficult process.  At one point (maybe a few months in) I realized that Spanish is not just a literal translation of English.  That is, you can't just translate the words from English to Spanish, and you can't just translate the grammar either.

Aside from the basic spelling and grammar rules, languages are made up of many things: styles, idioms, phrases, mannerisms, local accents, dialects, and more.  Because of this, the trick to learning a new language (once you have a basic understanding of the syntax and grammar) is to think in the new language.  You want to make the dialects and colloquialisms your own.  You don't want to translate directly as you're speaking, as this usually leads to pidgin.  You want to fluently express yourself in the new language and be understood as well as possible.

Translating this to learning a new programming language, you can see that just knowing the syntax and APIs of the new language is not enough. When writing code in a new language you want to own that new language.  You want the code to look like it was written by someone fluent and familiar with the language.  This will greatly improve the quality and maintainability of the code. 

This means that when you learn a new programming language, you should learn its coding standards as well.  So when you write code in C#, it should look like C# and should not use Hungarian notation.  And when you write code in Objective-C, it should follow Objective-C conventions and shouldn't look like .NET or anything else.  It can be difficult to adjust to different code/language styles but you haven't really learned the new language until you've done so.
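As a tiny illustration (a hypothetical snippet of my own, not from any real codebase), here is the same line written with a Hungarian accent and then in idiomatic C#:

// Hungarian notation transplanted into C#: technically fine, but it reads like an accent
string strCustomerName = txtName.Text;

// Idiomatic C#
string customerName = nameTextBox.Text;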

Wednesday, September 16, 2009

How much do you trust Microsoft?


Being a .NET (Microsoft stack) developer requires a substantial amount of trust in Microsoft.  I have to trust them to stay in business.  I have to trust that in-depth knowledge of their technologies is a good economic decision for the years ahead.  I have to trust them to fix bugs in their software, trust the quality of their products like IIS, Windows Server, Visual Studio, SQL Server, and others, and trust them to support the technologies I use for the foreseeable future.  I have to trust that the web sites I develop will work properly when running on their platform. 

This is a lot of trust.  And Microsoft has been working hard for years to gain it.  But now they are asking for an entirely new level of trust.  With Azure, Microsoft's cloud computing service, they are asking us to trust not only their software but their hardware as well.  They're now asking us to trust them to take care of all the infrastructure and to trust that everything will work just fine in their hands. 

This is a big step.  And so I wonder if Scott Guthrie's announcement yesterday about putting jQuery and ASP.NET AJAX on Microsoft's new AJAX CDN is not just an act of goodwill but is also intended to help build trust in Microsoft-hosted web content.  Maybe it's a bit of both.

Friday, September 11, 2009

Performance of an ASP.NET Page versus an HTTP Handler

A coworker recently mentioned to me that he was thinking about using an HTTP handler instead of a page for performance reasons. My first thought was of a quote I saw in a comment on StackOverflow: "He who sacrifices correctness for performance deserves neither." I like the quote, but it doesn't really say much about the problem at hand.

My second thought was that I have trouble imagining that the ASP.NET page life cycle's events and virtual method calls would really add much overhead. Especially if these requests have to communicate with the database; I would think that the cost of calling the database would far outweigh the cost of generating the page.

But talk is cheap. The code for my page is very simple:

<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="PerfTestPage.aspx.cs" Inherits="WebSandBox.PerfTestPage" %>Hello

The code-behind for the page is just an empty partial class declaration. "Hello" is on the same line as the page directive so I don't end up with extra "\r\n" characters in my response stream.

The code for the handler is equally simple:

public class PerfTestHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }
    public void ProcessRequest(HttpContext context)
    {
        context.Response.Write("Hello");
    }
}
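For reference, here is roughly how such a handler might be registered in web.config (a sketch assuming the IIS 7 integrated pipeline; the handler name and type match the code above):

<system.webServer>
  <handlers>
    <add name="PerfTestHandler" verb="*" path="PerfTestHandler.axd"
         type="WebSandBox.PerfTestHandler, WebSandBox" />
  </handlers>
</system.webServer>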

After wiring up the page and the handler in IIS, I created a separate project to do the performance testing. The method that makes the call does a simple GET request and ensures that the text "Hello" came back in the response.

public static void DoRequest(string url)
{
    WebRequest request = WebRequest.Create(url);
    request.Method = "GET";
    using (WebResponse response = request.GetResponse())
    using (var reader = new StreamReader(response.GetResponseStream()))
    {
        if (reader.ReadToEnd() != "Hello")
            throw new Exception();
    }
}

To ensure that the test is fair, I call each URL a bunch of times before actually timing anything. This ensures that I'm not measuring requests that are still warming up IIS. In addition, everything runs locally, so network time should not be a factor. Here is the test runner:

static void Main(string[] args)
{
    var page = "http://localhost/WebSandBox/PerfTestPage.aspx";
    InitializeTest(page);
    var pageTime = RunTest(page);

    var handler = "http://localhost/WebSandBox/PerfTestHandler.axd";
    InitializeTest(handler);
    var handlerTime = RunTest(handler);

    Console.WriteLine("page time: {0} ms", pageTime);
    Console.WriteLine("handler time: {0} ms", handlerTime);

    Console.ReadKey();
}

static void InitializeTest(string url)
{
    for (int i = 0; i < 20; i++)
    {
        PerfTester.DoRequest(url);
    }
}

static long RunTest(string url)
{
    var sw = new Stopwatch();
    for (int i = 0; i < 1000; i++)
    {
        sw.Start();
        PerfTester.DoRequest(url);
        sw.Stop();
    }
    return sw.ElapsedMilliseconds;
}

For this test we're executing each request 1,000 times to make the time more significant. Here are the results:


Page (ms per 1,000 requests)    Handler (ms per 1,000 requests)
6186                            6259
6152                            6216
6145                            6182
6156                            6208
6133                            6201
6169                            6190
6536                            6416
6161                            6190
6162                            6189
Average: 6200                   Average: 6228

I don't understand why or how the page came out faster than the handler. Maybe IIS is doing some kind of caching, or maybe there was additional traffic and CPU usage on my computer during these tests. But at around 6.2 ms per request for both, I don't see much of a difference.

Also, this test only covers returning the simple text "Hello". There would of course be additional overhead when using ASP.NET controls on the page, but I doubt that the overhead of those controls would far outweigh whatever method the handler would use to generate the same HTML. And that's another topic entirely.

The moral of today's story: "measure twice, cut once".

Thursday, September 3, 2009

My new MacBook


This Monday I received a MacBook so I could create an iPhone application. 

This is the first time I've used a Mac since my mom brought home something like this before I was in high school.  It's also the first time I'll be using a Unix-like system since college.  Needless to say, I'm very excited.

[photo: my new MacBook]

I've started playing with it a little.  I downloaded Xcode and the iPhone SDK.  You can see in the photo that I learned how to change the background color.  I think this is probably one of the most common things people do right after turning on a new computer for the first time.

After using C# for the last bunch of years, Objective-C is like getting in a time machine and going back 10 years.  So many practices that I've come to take for granted are gone.  I doubt I'll ever come to grips with the "NS" prefix on everything. 

At the same time, the MVC approach in Cocoa seems pretty cool.  And the idea of message passing for dynamic invocation of methods in a strongly typed system is very interesting.

In addition to my thoughts and feelings about the platform, framework, language, hardware, etc., it's also cool to be starting from scratch again.  I barely know how to save a file on the Mac.  I haven't explicitly used a pointer in almost 10 years.  I'm excited to see what the learning curve is like and exactly which programming skills really do transfer between languages and frameworks.  I've been learning F#, Haskell, Python, and Ruby out of personal curiosity, but now I have to actually build something in Objective-C.  Should be interesting.

Tuesday, September 1, 2009

Dynamic Stored Procedure Execution


With the imminent arrival of the dynamic keyword in C#, I think we will be seeing more prototypes of ways to put dynamic objects to use within .NET.  One interesting use is Phil Haack's method of HTML-encoding properties that are prefixed with an underscore.  After reading about this I wanted to try creating my own dynamically driven class.

Disclaimer:  I'm still on the fence over the use of dynamic in C#.  I'm not advocating the use of this code.  I just wanted to see if it was possible and what it would look like.

One problem with ORMs is mapping stored procedures.  This is usually a manual process because stored procedures are inherently untyped.  So I was envisioning something along the lines of:

IList<Person> people = DataContext.Sproc.ExecuteGetList<Person>().spGetPeopleByDOB(dob: myDate);

For the purpose of this prototype, I decided not to use any real data context.  That could be added in easily enough.  For now, I'll be rolling my own data layer.

Step 1 - Create a simple DataContext:

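Here's a minimal sketch of what I mean (the connection string name and the connection handling are assumptions for illustration):

using System.Configuration;

public static class DataContext
{
    // Sketch only: a real data context would manage connections and lifetime properly.
    public static string ConnectionString
    {
        get { return ConfigurationManager.ConnectionStrings["Default"].ConnectionString; }
    }

    public static Sproc Sproc
    {
        get { return new Sproc(); }
    }
}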

Step 2 - Create the Sproc class

This class will basically be a factory for the different dynamic stored procedure executors.  This way we can easily differentiate between ExecuteScalar, ExecuteNonQuery, getting an object, and getting a list of objects.  For now, I'm going to implement ExecuteScalar<T>:

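A sketch of the factory (only ExecuteScalar<T> for now):

public class Sproc
{
    // Returns a dynamic object; any method called on it is treated as a stored procedure name.
    public dynamic ExecuteScalar<T>()
    {
        return new DynamicExecuteScalar<T>();
    }

    // ExecuteNonQuery, ExecuteGetObject<T>, ExecuteGetList<T>, etc. would follow the same pattern.
}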

Step 3 - Create the DynamicExecuteScalar<T> class

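Here's a sketch of the executor (assuming, as in the example above, that all stored procedure arguments are passed as named parameters):

using System.Data;
using System.Data.SqlClient;
using System.Dynamic;

public class DynamicExecuteScalar<T> : DynamicObject
{
    public override bool TryInvokeMember(InvokeMemberBinder binder, object[] args, out object result)
    {
        // binder.Name is the name of the method the caller invoked, e.g. "spGetPeopleByDOB".
        using (var connection = new SqlConnection(DataContext.ConnectionString))
        using (var command = new SqlCommand(binder.Name, connection))
        {
            command.CommandType = CommandType.StoredProcedure;

            // binder.CallInfo.ArgumentNames pairs up with args[] to give us the SQL parameters.
            for (int i = 0; i < binder.CallInfo.ArgumentNames.Count; i++)
                command.Parameters.AddWithValue("@" + binder.CallInfo.ArgumentNames[i], args[i]);

            connection.Open();
            result = (T)command.ExecuteScalar();
        }
        return true;
    }
}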

The real magic here is the use of TryInvokeMember.  This is the method that gets invoked at runtime whenever someone calls a method on your DynamicObject.  So when I call myDynamicExecuteScalar.CallingSomeMethod(), I'm actually calling TryInvokeMember.

Inside TryInvokeMember we use binder.Name to get the name of the method that was called (e.g. "CallingSomeMethod").  Then we use binder.CallInfo.ArgumentNames together with args[] to get the named parameters that we can turn into our SQL parameters.

Once you have this code wired up, you can get a scalar for any stored procedure from your code simply by calling:

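Something along these lines (the stored procedure and parameter names here are hypothetical):

int personCount = DataContext.Sproc.ExecuteScalar<int>().spGetPersonCount(minAge: 21);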

And using this as a starting point you can add ExecuteNonQuery, ExecuteGetList<T>, etc.  You can also modify the data context to be more extensible and mockable.

Tuesday, August 25, 2009

Design vs. Structure

In many conversations about development, words like design and structure get thrown around.  When discussing software development, it is important to understand the difference between the two.

According to Merriam-Webster, design is

a particular purpose held in view by an individual or group... deliberate purposive planning

The definition of structure is

something arranged in a definite pattern of organization... manner of construction

The difference is that design is about purpose while structure is about organization.  Code that is structured but has no purpose for its structure may as well not be structured at all.  There is no cohesion, no single responsibility, and no reason for a class to exist if it exists for structure alone.  When we set out to structure our code, we should have a design purpose in mind. 

Another way of saying this is "form follows function".  Determine the function and the form will follow.  Determine the design and the structure will follow.

Monday, August 24, 2009

Member Variables are not Global Variables

It is widely accepted that global variables, in almost any context, are a bad practice. But there is a common practice that leads to code that is harder to maintain: the use of member variables to pass information between functions for a particular set of functionality. This is a subtle way of creating global variables in that they are "global" between functions.

As an example, let's look at the code below. For this example, we're going to assume that "memberVariable" is not used anywhere else but "DoA()" and "DoB()".

private bool memberVariable = false;

public void MainFunction()
{
    DoA();
    DoB();
}

private void DoA()
{
    // do some stuff

    if (something)
        memberVariable = true;
    else
        memberVariable = false;

    // maybe do something else
}

private void DoB()
{
    // do some stuff

    if (memberVariable)
        DoOneThing();
    else
        DoAnother();

    // do more stuff
}



Member variables are meant to maintain the state of the particular object. But as can be seen in this case, "memberVariable" does not contain state; it contains a flag that is particular to the logic of "MainFunction()". This is a bad practice because it is very hard to know what "DoB()" is going to do without also examining "DoA()". If this were expanded further and the class had many functions, it would be difficult to know who uses "memberVariable", how, and why.

There are a number of ways to refactor this code to eliminate this use of "memberVariable": DoB() could take the value as a parameter, or it could retrieve it by calling another function. Even though there are different ways to alleviate the problem, the idea is the same. Functions should be autonomous; they should not require that other functions be executed first in order to work properly. And even if a "global" variable is only global to two private functions, it's still global.
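For example, here's a sketch of the parameter-passing version:

public void MainFunction()
{
    bool flag = DoA();
    DoB(flag);
}

private bool DoA()
{
    // do some stuff
    return something;
}

private void DoB(bool flag)
{
    // do some stuff

    if (flag)
        DoOneThing();
    else
        DoAnother();

    // do more stuff
}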

Wednesday, August 19, 2009

Don't Arbitrarily Reference the Base Class


All too often I have seen developers referencing properties and methods of the base class directly. Instead of calling “this.SomePropertyOrMethod()” or just “SomePropertyOrMethod()”, they call “base.SomePropertyOrMethod()”. This has a number of drawbacks.

1) It implies that the class knows something specific about its parent.

2) It makes it difficult to override the method in this class or in a class that inherits from it: the base version of the method will always be called, even if the method is overridden (see the sketch at the end of this post).

3) It defeats the purpose of object-oriented programming. Calling the base class directly is akin to calling the same method on a utility class; the only difference is that the base class might be looking at the current class’s state. It’s just messy.

4) This is a code smell that makes me think of the Base Bean anti-pattern. “A class should not inherit from another class simply because the parent class contains functionality needed in the subclass.”

Now, there are definitely times when this is necessary, and that is why the keyword exists in the first place. If I’m overriding Page.OnLoad, I should call base.OnLoad at some point. If I override Collection<>.InsertItem, I had better call base.InsertItem() so the item actually gets added to the collection.

But just because you can do something doesn’t mean that you should, and this keyword is no different. Most cases of base.SomePropertyOrMethod() should be replaced with this.SomePropertyOrMethod().
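To make drawback 2 concrete, here's a minimal sketch with hypothetical classes. The base. call is bound to the parent's implementation at compile time, so it silently skips any override further down the hierarchy:

class Animal
{
    public virtual string Describe() { return "animal"; }
}

class Dog : Animal
{
    public override string Describe() { return "dog"; }

    public string ViaThis() { return this.Describe(); } // virtual dispatch: returns "puppy" on a Puppy
    public string ViaBase() { return base.Describe(); } // always returns "animal", overrides ignored
}

class Puppy : Dog
{
    public override string Describe() { return "puppy"; }
}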

Tuesday, August 11, 2009

Maintenance Cost and C# Auto Properties


Auto properties in C# provide a simple way to implement a property on a class without explicitly creating the field that the property accessors (get and set) work on.  There have been many discussions on whether or not auto properties are good, on their advantages, and on their disadvantages.  From a code-maintenance perspective, an auto property is much better.

When maintaining code, one big problem is making sure that the code you’re maintaining isn’t doing anything tricky that will make it difficult to modify or fix.  This is one of the reasons why SRP and DRY are so important.  If there’s only one place that does something and that place is only responsible for the one thing that it does, a code maintainer’s life is very joyful.  When this is not the case, a small change or fix may have detrimental repercussions.

With this in mind, there is a maintenance cost to not using auto properties.  If there’s a field hidden under your property then, as a maintainer of your code, I have to check all uses of the field as well as all uses of the property.  With an auto property, the property is the only thing that needs to be investigated.
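To make that concrete, compare the two forms (you'd have one or the other, not both in the same class):

// Auto property: the property is the single point of access to investigate.
public string Name { get; set; }

// Explicit backing field: every use of "name" now has to be checked
// in addition to every use of "Name".
private string name;
public string Name
{
    get { return name; }
    set { name = value; }
}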

When explicitly creating a field for a property, it is important to consider the costs of doing so.  In addition, it’s important to ask whether the field is really necessary.  Can all your methods access the property instead?  If so, why even have this extra level of indirection?  Chances are YAGNI plays a big part here.

Wednesday, March 18, 2009

DeflateStream versus GZipStream


DeflateStream and GZipStream are the two compression classes that come standard in the System.IO.Compression namespace.  If you're looking for all-out performance, DeflateStream is the answer.  According to MSDN,

The GZipStream class uses the gzip data format, which includes a cyclic redundancy check value for detecting data corruption. The gzip data format uses the same compression algorithm as the DeflateStream class.

GZipStream is basically DeflateStream with some additional functionality and integrity checking. So if you only care about speed, DeflateStream will be faster.
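For example, here's a minimal sketch of compressing a buffer with DeflateStream; GZipStream could be dropped in on the same line if the CRC-checked gzip format is needed:

using System.IO;
using System.IO.Compression;

static byte[] Compress(byte[] data)
{
    using (var output = new MemoryStream())
    {
        using (var compressor = new DeflateStream(output, CompressionMode.Compress))
        {
            compressor.Write(data, 0, data.Length);
        }
        return output.ToArray();
    }
}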

Friday, January 2, 2009

How I Didn't Get in Trouble

A few months ago I noticed a significant performance problem during application startup of the really big web system I work on.  It was literally taking up to a minute and a half for IIS to load the application.

After some digging I found that the application was loading about a million records from the database into the application's cache.  Regardless of the decision to implement this, I did not want it happening locally.  I couldn't test any of my work.  I couldn't develop.

Where I work, even though we all have a local development environment, we all share the same development database server.  And these million or so records were being loaded by a stored procedure aptly named spGetAll[Things]. 

I notified my boss and the developer responsible for creating this stored procedure about my problem and told them that I was practically unable to work and that we had to change this behavior in our development environment.  They said that it would be done...  And I waited... and waited...

And after a day and a half, I modified the stored procedure to return only the top 20 rows - problem solved!  But I knew that this could cause problems for someone else who might be debugging a problem related to this data and find that the data was missing. 

So I waited, expecting someone to yell about the missing data and then at me.  But I could work again.  Others could work again.  Adding "Top 20" to a stored procedure on a development database saved countless minutes and probably a lot of money for the company. 

Finally the email arrived.  "There is a problem with the data in such and such table and we're going to overwrite it with what's on production.  Is this OK?"  So I humbly sent out an email saying that I modified the spGetAll[Things] and that maybe this is the problem.

Here I am, thinking that I'm going to get in trouble, that some poor developer spent hours or even days trying to track down a problem that I intentionally created.  And what happens?  They apologize to ME.  OMG.  They said that they were sorry but they'd have to fix the stored procedure. 

"Ok" by me.  They fixed the stored procedure.  And I quickly add a configuration variable to use a new stored procedure spGetAll[Things]Truncated.  I probably should have done this in the first place but adding "Top 20" is just so much simpler.  Especially when it's done under the radar.

The moral: if I hadn't fessed up to what I'd done, I could have gotten in real trouble... maybe even fired.  Don't be ashamed of your work, even when it's bad.  There is no way to work in a team if you hide what you've done.  And you'll see that people are very forgiving when you admit to making a mistake.