Many times I’ve gone to Microsoft’s Connect Feedback site to report a bug in Visual Studio or SQL Server, only to find that it had already been reported and that an MS rep had closed it because they could not reproduce it. This is absurd! It is inexcusable that there are no detailed application logs that give MS everything they need to determine the cause of an issue. They miss countless opportunities to improve their products, and it is very discouraging for users who went to the effort of providing detailed bug reports.
When deploying solutions to a large number of users, I found it critical to have detailed logging for both applications and installers. I worked out a logging system for our .NET applications using log4net, which supports rolling log files and limits the disk space the logs occupy. Logging can also be turned on or off at runtime, so that performance-critical applications can run without it. For our installers we used the special logging build of NSIS, along with some additional functions to support logging.
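As a rough illustration of that kind of setup, here is a minimal log4net configuration sketch. It is not our actual configuration: the appender name, file path, size limits, and pattern are assumptions chosen for the example. It uses log4net's RollingFileAppender to roll by size and keep a fixed number of backups, which caps the total disk usage.

```xml
<log4net>
  <!-- Hypothetical appender name and path; adjust for your application. -->
  <appender name="RollingFile" type="log4net.Appender.RollingFileAppender">
    <file value="logs/app.log" />
    <appendToFile value="true" />
    <!-- Roll when the current file reaches maximumFileSize. -->
    <rollingStyle value="Size" />
    <maximumFileSize value="10MB" />
    <!-- Keep at most 5 old files, so the logs stay around 60 MB in total. -->
    <maxSizeRollBackups value="5" />
    <staticLogFileName value="true" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date [%thread] %-5level %logger - %message%newline" />
    </layout>
  </appender>
  <root>
    <!-- Set to OFF to disable logging, DEBUG for full detail. -->
    <level value="DEBUG" />
    <appender-ref ref="RollingFile" />
  </root>
</log4net>
```

If the application loads this file with XmlConfigurator.ConfigureAndWatch, log4net watches the file for changes, so switching the root level between OFF and DEBUG should take effect without restarting the application.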
On many occasions we were able to determine the cause of intermittent or hard-to-reproduce issues by looking at these logs. When the developer or tester does not have access to the user’s machine, being able to determine almost everything you need to know from a log the user sends you is invaluable. It also clears up the sometimes vague description of the problem and the steps the user provides, because you can see exactly what the user was doing leading up to the issue. There were even times when, even with the logs, we could not reproduce the issue because of the user’s unique setup, yet we were still able to determine why it was occurring, fix it, and send the fix to the user to verify that it did indeed solve the problem. How awesome is that?
If you ever hear yourself telling a user “I was unable to reproduce your issue”, then your product is to blame. This is not roulette. These are fully deterministic computers (even the random numbers are not really random). So don’t act like it is just bad luck or misfortune on the user’s part. It is your application’s design, and its lack of adequate feedback and logging, that has failed. Logging is one of the easiest things to implement. So stop making excuses and start logging.