Practicing .NET

Improving developer productivity and software quality

by Mark M. Baker

September 2006


September 25, 2006

Memory is cheap, but not always plentiful


As Monk would say, "it's a blessing and it's a curse." I'm speaking of memory management in .NET, not a TV show, but the point is the same: something so helpful can at times be a real pain.

One of .NET's major advantages for Windows developers is that it manages memory for you rather than leaving the job to the developer. Memory management, including guarding against null pointers and knowing when to "delete" unneeded memory, has been the bane of C-based developers for years. In .NET you just create an object and forget about the process of reclaiming it. You need to create an object and return it to a caller? No problem. Neither side needs to take "ownership" of the resulting object - .NET handles the tracking for you.

Sweet.

Well, not always. In real life, users do unexpected and really annoying things like opening lots of files, printing lots of large documents, and so on. These things can consume lots of objects and lots of memory. Worse, since the developer has turned the reins of memory management over to .NET, when memory consumption problems do arise it can be time-consuming to find the culprits.

One tool sorely lacking from Visual Studio .NET is a reference counter/viewer that shows, at any point in an application's life cycle, what objects exist, how many references exist to them and where those references lead. Such a tool may uncover a tangled web of interlocking relationships that hogs needed memory and requires some pruning on the part of the developer.

Currently, we're using tools from Red-Gate Software to do the analysis. What're you using?

Let me know.

Posted by Mark M. Baker at 03:21 PM


September 19, 2006

In the valley of filename validation


Yesterday I was working on some code and realized I needed to validate a filename in .NET. Seemed like a straightforward thing to do. I started rummaging around in the File, Directory and Path classes in System.IO looking for something like "IsValid" that I could use.

Nothing. Thus started my journey into the valley of filenames.

I then thought I might be able to use something like Path.GetFileName to see whether it would reject a bogus filename string in a path variable. Although the function throws an ArgumentException when it finds characters it considers invalid, it doesn't catch every bad name.

Next I moved on to Path.GetInvalidFileNameChars which returns an array of characters that are *potentially* invalid as filename characters. Hmm. This seemed like a worthy candidate. Then I read through the MSDN help on the API to discover this helpful mention:

The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names. The full set of invalid characters can vary by file system. For example, on Windows-based desktop platforms, invalid path characters might include ASCII/Unicode characters 1 through 31, as well as quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0) and tab (\t).

Well at least they documented the fact it isn't reliable.
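For what it's worth, here is a minimal sketch of the character screen that Path.GetInvalidFileNameChars makes possible. The class and method names (FileNameScreen, PassesCharacterCheck) are made up for illustration. As the MSDN note warns, a passing result only means no known-bad character was found, not that the name is valid.

```csharp
using System;
using System.IO;

public static class FileNameScreen
{
    // Hypothetical helper: returns false when the name is empty or contains a
    // character reported by Path.GetInvalidFileNameChars. A true result is a
    // hint only, because the reported set is not guaranteed to be complete.
    public static bool PassesCharacterCheck(string fileName)
    {
        if (string.IsNullOrEmpty(fileName))
        {
            return false;
        }

        return fileName.IndexOfAny(Path.GetInvalidFileNameChars()) < 0;
    }
}
```

A call such as FileNameScreen.PassesCharacterCheck("report.txt") is a cheap first filter before doing anything more expensive.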

So I did what most other engineers do when they're stumped - I hit Google. It didn't take long to find the following blog post by Brian Dewey at Microsoft. The relevant passage is the following:

A common question for people starting to program on Windows is, “What makes a valid Windows file name?” You want to use this information to make simplifying assumptions in your code: that names can be no longer than MAX_PATH, that two names won't differ only by case, etc. Unfortunately, the answer to what makes a valid file name in Windows is not simple.

He goes on to describe peculiarities in the NTFS and POSIX subsystems, which allow for any Unicode character, whereas the Win32 subsystem does not, likely for historical reasons (i.e., its Win16/DOS heritage).

By this time, it became apparent there was only one good, reliable and simple way to test for a valid filename - try to create it. So I wrote some code to take the filename and create it in the temp directory. If it worked, the filename was ok (for now I'm ignoring issues where the file already exists and is read-only). If not, the filename was bad on the platform.
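The original code isn't shown, so the following is only a sketch of the try-to-create-it idea under a few assumptions of my own. The names FileNameProbe and CanCreate are hypothetical. The sketch rejects names that carry a directory part so the probe stays inside the temp directory, it treats an already existing file as proof the name is fine, and it reports any failure to create the file as "invalid" even though causes such as a full disk would look the same.

```csharp
using System;
using System.IO;

public static class FileNameProbe
{
    // Hypothetical helper: tests a bare file name by creating and then deleting
    // a file of that name in the temp directory.
    public static bool CanCreate(string fileName)
    {
        if (string.IsNullOrEmpty(fileName))
        {
            return false;
        }

        string path;
        try
        {
            // Assumption: only bare names are accepted, so the probe cannot
            // wander outside the temp directory.
            if (Path.GetFileName(fileName) != fileName)
            {
                return false;
            }
            path = Path.Combine(Path.GetTempPath(), fileName);
        }
        catch (ArgumentException)
        {
            return false;
        }

        // A file that is already there shows the name is acceptable.
        if (File.Exists(path))
        {
            return true;
        }

        try
        {
            // CreateNew fails rather than overwriting, so nothing is clobbered.
            using (new FileStream(path, FileMode.CreateNew, FileAccess.Write))
            {
            }
            File.Delete(path);
            return true;
        }
        catch (ArgumentException) { return false; }
        catch (NotSupportedException) { return false; }
        catch (IOException) { return false; }
        catch (UnauthorizedAccessException) { return false; }
    }
}
```

The answer is only good for the file system behind the temp directory, which is the same platform-specific caveat the post arrives at.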

Then it occurred to me that this subjectivity may be the reason there is no File.IsValid method in .NET. Made sense after all.

Posted by Mark M. Baker at 07:13 PM


September 08, 2006

Math.Pow to the People


The .NET framework is crammed with nifty and useful small classes that provide commonly needed utility functions. You can't write a substantive application without finding a need for at least a handful of them. If you're working on an application doing computations, particularly ones involving equations that need to be solved, you'll probably end up looking at the .NET Math class.

In that class is a method I've had some recent intense exposure to that dredged up memories of high school algebra classes, wondering what I'd do with algebra down the road, and being thankful I grew up just after slide rules disappeared from common use. But I digress.

The method in question in my case is the ever handy Pow method. This little guy takes a double and raises it to a double power. Very handy. However, I quickly ran into some real world issues with it that required cracking open some of those old dusty books and re-reading the rules of algebra.

You see, the equation I was toying with deals with raising numbers to an integer power. This makes things a bit easier and more straightforward. But Math.Pow, although highly useful, wasn't designed for the .NET Decimal type, and it isn't tuned to this special case of integer powers. It's handy but, well, slow.

I was using it in some algorithms that I had neither the time nor the sanction to overhaul in order to reduce or eliminate the use of Math.Pow. I was stuck with how it was used. But I wondered if I could use it more efficiently. Thus the turn to algebra.

A little refresher.

Take a number (any number) and raise it to an integer power, let's say 10. Generally, we'd write that as x ^ 10. Now using the rules of algebra, x ^ 10 = x ^ ( 5 * 2 ). But things get more interesting. We can also write this as x ^ 10 = (( x ^ 5 ) ^ 2 ). That is, raise x to the power of 5 and then raise that value to the power of 2. Mathematically we get the same number in the end. But here's the kicker. Raising a value to the power k by repeated multiplication takes k - 1 multiplications, so the straightforward x ^ 10 costs 9 of them. In the alternative style we pay 4 for the fifth power and 1 for the square, which reduces that to 4 + 1 = 5 multiplications.

You can generalize this even further by finding the set of integer factors of the exponent that sum to the smallest possible value, which in practice means its prime factors. Each factor f costs f - 1 multiplications, so that sum, less the number of factors, is the fewest multiplications this factoring approach can achieve. Take a number like 100. It can be rewritten as 10 * 5 * 2, whose factors sum to 17, or better as 5 * 5 * 2 * 2, whose factors sum to 14. So if I need to compute 1.2 ^ 100, I can rewrite that as (((1.2 ^ 5) ^ 5) ^ 2) ^ 2. That is 4 + 4 + 1 + 1 = 10 multiplications, a whole lot less than 99.

The sharp reader might notice that a flaw in the approach is dealing with prime numbers like 7 or 13. These numbers have no divisors other than themselves and 1, so the approach does no better for them than just doing the full set of multiplications. But for non-primes, particularly those with many divisors, this approach can result in significant performance improvements over Math.Pow.

Finding the set of divisors for an integer that form the smallest possible sum is mechanical and lends itself to a lookup table. Easy to implement too.
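Here is a minimal sketch of the factoring idea in C#, with plain trial division standing in for the lookup table. The names IntegerPower, Factor and Pow are hypothetical, and working on decimal with ordinary multiplication is my assumption. The multiplication counter is there only to make the savings visible.

```csharp
using System;
using System.Collections.Generic;

public static class IntegerPower
{
    // Prime factors of n in ascending order, e.g. 100 -> 2, 2, 5, 5.
    // Trial division is used here; a lookup table would serve equally well.
    public static List<int> Factor(int n)
    {
        if (n < 1)
        {
            throw new ArgumentOutOfRangeException("n");
        }

        List<int> factors = new List<int>();
        for (int d = 2; d <= n / d; d++)
        {
            while (n % d == 0)
            {
                factors.Add(d);
                n /= d;
            }
        }
        if (n > 1)
        {
            factors.Add(n); // whatever remains is itself prime
        }
        return factors;
    }

    // Raises x to a non-negative integer power one factor at a time,
    // so x ^ 10 is computed as (x ^ 2) ^ 5.
    public static decimal Pow(decimal x, int exponent, out int multiplications)
    {
        multiplications = 0;
        if (exponent < 0)
        {
            throw new ArgumentOutOfRangeException("exponent");
        }
        if (exponent == 0)
        {
            return 1m;
        }

        decimal result = x;
        foreach (int f in Factor(exponent))
        {
            decimal factorBase = result;
            for (int i = 1; i < f; i++) // f - 1 multiplications per factor
            {
                result *= factorBase;
                multiplications++;
            }
        }
        return result;
    }
}
```

Calling IntegerPower.Pow(2m, 10, out count) yields 1024 with count set to 5, and an exponent of 100 needs 10 multiplications. A prime exponent such as 7 still takes the full 6.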

So the next time someone asks what good is algebra, tell them that like that old TV show from the 1970's, you can "compute that value in 2 notes". Um, I mean 2 divisors with a little help from basic algebra.

Peace.

Posted by Mark M. Baker at 11:49 PM


September 05, 2006

Would you like some MSIL with your COM?


I described last time my dilemma in trying to get a COM component to work with my nice little XCOPY deployed .NET 2.0 application. That is, without registering the COM component beforehand. The smarties among you out there probably jumped up and said "Whoa! I can do that with the nifty .NET 2.0 Registry-less COM deployment technology". Yes, you're totally correct - I could have just specified a manifest for my COM component and away I go.

Just one problem though.

It doesn't work on anything less than Windows XP SP2. That rules out Windows 2000, NT and any of the Windows 9x series. In my case, I don't need to worry anymore about NT and the 9x series (hooya!) but I still do need to worry about Windows 2000 - approximately 20% of our customers still use it although the number shrinks slowly every quarter. So I need a solution that is guaranteed to work, not mostly, kinda work.

Now to the solution.

One of the things that strikes me about software development nowadays compared to times past is that we rarely need to dive down and program close to the "metal" or, in this case, the "CLR". I mean how many of you, honestly, have ever cracked open a tome on MSIL and read the instruction set, let alone written any code with it? Hmmm. Don't see a lot of hands out there. Mine is certainly not one of them.

But in this case, a little low-level MSIL was what I needed. Without further delay, here's the code that registers any COM component given an IntPtr value from a native interop call to the Windows function GetProcAddress:

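.assembly extern mscorlib {}
.assembly ComSupport {}

.namespace MyCompany.MsIlTools
{
    .class public ComSupport
    {
        // pfn is the address of the DLL's DllRegisterServer export
        .method public static void Register(native int pfn)
        {
            .maxstack 1
            ldarg.0                         // Push pfn onto the execution stack
            calli unmanaged stdcall void()  // Call it as a parameterless stdcall function
            // The HRESULT that DllRegisterServer returns is ignored by this signature
            ret
        }
    }
}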

Wow! For a low-level language, that's pretty readable even if you've never tinkered much with MSIL. Looks pretty much like a C# static function that takes a function pointer (even though C# doesn't support function pointers) and calls it as an unmanaged bit of code. Since we know DllRegisterServer (the entry point into a DLL to register it) takes no parameters, the setup and call to the function pointer is trivial.

To use this code, you drop it into a .IL file, compile it with ILASM.EXE to an assembly and add the assembly to your project. Then call the "Register" function just as you would any C# static function in a class.
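The calling side isn't shown in the post, so here is a hedged sketch of what it might look like. The ComRegistrar class and its delegate type are hypothetical. The P/Invoke declarations are for the standard kernel32 functions LoadLibrary, GetProcAddress and FreeLibrary. To keep the sketch self-contained it makes the call through Marshal.GetDelegateForFunctionPointer, which is a swapped-in alternative and not the calli helper above. With the IL assembly referenced, the marked lines would become a call to MyCompany.MsIlTools.ComSupport.Register(pfn).

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

public static class ComRegistrar
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr LoadLibrary(string fileName);

    [DllImport("kernel32.dll", CharSet = CharSet.Ansi, ExactSpelling = true, SetLastError = true)]
    private static extern IntPtr GetProcAddress(IntPtr module, string procName);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FreeLibrary(IntPtr module);

    // DllRegisterServer takes no parameters and returns an HRESULT.
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    private delegate int DllRegisterServerFn();

    // Hypothetical helper: loads the COM DLL and runs its self-registration export.
    public static void Register(string dllPath)
    {
        if (!File.Exists(dllPath))
        {
            throw new FileNotFoundException("COM DLL not found.", dllPath);
        }

        IntPtr module = LoadLibrary(dllPath);
        if (module == IntPtr.Zero)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }

        try
        {
            IntPtr pfn = GetProcAddress(module, "DllRegisterServer");
            if (pfn == IntPtr.Zero)
            {
                throw new Win32Exception(Marshal.GetLastWin32Error());
            }

            // Swapped-in alternative to the IL helper. With the IL assembly
            // referenced, these lines would be ComSupport.Register(pfn).
            DllRegisterServerFn register = (DllRegisterServerFn)
                Marshal.GetDelegateForFunctionPointer(pfn, typeof(DllRegisterServerFn));
            Marshal.ThrowExceptionForHR(register());
        }
        finally
        {
            FreeLibrary(module);
        }
    }
}
```

Unlike the void signature in the IL version, this sketch surfaces a failing HRESULT as an exception. Whether that is wanted depends on how the application treats a failed registration.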

Nice.

Posted by Mark M. Baker at 10:40 PM


