Wednesday, 13 July 2011

Microsoft Office Interop - Tricks and Traps

Having spent the last week playing with Microsoft Office interop, 2 things have become very clear. Firstly it is extremely powerful and can do most of the things you are likely to need, and secondly, it's a PITA to get right!

The first thing you will notice as you work through the different libraries is it's fairly clear that while there was some standardisation between the teams, just because a particular process works in Word (for example) there is no guarantee that the same process will work in Excel or PowerPoint. A good illustration of this is trying to change the printer.

In Word you can simply set the ActivePrinter property of your Microsoft.Office.Interop.Word._Application object to the name of the printer you want to use, for example "HP LaserJet 4250 PCL6". In Excel however it wants a name including the printer port, this is supplied in the form "HP LaserJet 4250 PCL6 on LPT1". The problem is the port is not always the same as the one you can see in the printer properties page - and to compound the issue, the word 'on' is dependant on locale! Luckily if you have a Google around some helpful people have already written functions to help you get the right name for your printer - or you could do what I did and just set the printer to default and get your code to output the name of the current ActivePrinter :). Oh and did I mention when you set the ActivePrinter in Word it changes the system default printer... but Excel doesn't. Anyway you get the gist, this is a minefield and you can't take anything for granted.

This blog post is intended to cover 2 of the big issues I suffered with while I was playing with interop, cleaning up the COM objects and using Office interop from a Windows service. This article was written against Microsoft Office v12 (2007) interop libraries.


Rules for CLEAN use of Microsoft Office Interop

There are lots of examples of using interop on the internet and most of them are quick apps that have been knocked up to solve some problem with a business process - like adding a company footer to a large selection of old Word documents. Most of them do work, however very few of them actually take the time to talk about good practice and how to make sure you don't end up making a mess of your documents leaving locked files and application instances everywhere.

Here are some simple ideas that can help you keep things tidy with interop. As I mentioned before each Office application has slightly different support and I have found Word to be fairly tolerant of lazy coding, Outlook seems to be the real stickler for best practice so always...


Remember to Close and Exit

Ok so the first bit of information shouldn't come as a surprise, close applications when you are not using them. Just as when you are using Office directly when you are finished with a program you exit it, interop is the same. There is of course a small difference and that is you should try to close all your documents before exiting your application. This is another way to help keep things tidy. Also remember if you don't want to bother the user with any UI from Office always look for parameters to stop that. In this case I have forced the message to be discarded when I close it, that will stop a save box from displaying.


using MSi = Microsoft.Office.Interop;

            MSi.Outlook._Application outlk = new MSi.Outlook.Application();
            MSi.Outlook.NameSpace nmspace = outlk.GetNamespace("MAPI");
            MSi.Outlook.MailItem message = nmspace.OpenSharedItem("C:\myEmail.msg");

            message.Close(MSi.Outlook.OlInspectorClose.olDiscard);
            CleanUpComObject(message);
            CleanUpComObject(nmspace);
            outlk.Quit();
            CleanUpComObject(outlk);


As you can see I close the message and dispose it before I quit the application and dispose it. Most of the different programs support Close as a method on the document (message, spreadsheet, presentation ect) and Exit on the application. If you can't find those I would suggest having a poke around to see if there is a different method that needs to be called. After closing/exiting you need to clean up and it's easiest if you...


Have a clean up routine

As always in development, if you're going to do something a lot of times, make a function to do it for you. Here is mine:


        public void CleanUpComObject(object obj)
        {
            try
            {
                if (obj != null && System.Runtime.InteropServices.Marshal.IsComObject(obj))
                {
                    System.Runtime.InteropServices.Marshal.FinalReleaseComObject(obj);
                    obj = null;
                }
            }
            catch (Exception ex)
            {
                // You're probably going to want to log this
            }
        }


As you can see it's a simple function that checks that obj exists and it's a valid COM object - if it is then it's released and the reference is set to null. I have used FinalReleaseComObject rather than the more commonly used ReleaseComObject as if you are going to use ReleaseComObject you should really call it multiple times until it returns 0, FinalReleaseComObject saves you this hassle. Also I have put a try catch round the whole lot, this is obviously important. If you're trying to clean up a load of COM objects and the first one fails you don't want the thrown exception to cause the code to skip the rest of the house keeping. So what should you pass to this? Well you need to...


Keep references to everything

Another easy mistake to make is forgetting that while something might appear to be one object to you, inside those friendly .NET libraries all sorts of crazy things could be happening. Take this code as an example:


using MSi = Microsoft.Office.Interop;

            MSi.Outlook._Application outlk = new MSi.Outlook.Application();
            MSi.Outlook.NameSpace nmspace = outlk.GetNamespace("MAPI");
            MSi.Outlook.MailItem message = nmspace.OpenSharedItem("C:\myEmail.msg");

            if (message.Attachments.Count > 0)
            {
                        MessageBox.Show("There are attachments!");
            }

            message.Close(MSi.Outlook.OlInspectorClose.olDiscard);
            CleanUpComObject(message);
            CleanUpComObject(nmspace);
            outlk.Quit();
            CleanUpComObject(outlk);


Looks good, the message is closed, the application is exited and everything is cleaned up, but still this code will leave an instance of OUTLOOK in task manager. The access of the Count on the Attachments property on the message object has caused a new object to be created behind the scenes and that hasn't been cleaned up. The fixed code looks like:


using MSi = Microsoft.Office.Interop;

            MSi.Outlook._Application outlk = new MSi.Outlook.Application();
            MSi.Outlook.NameSpace nmspace = outlk.GetNamespace("MAPI");
            MSi.Outlook.MailItem message = nmspace.OpenSharedItem("C:\myEmail.msg");
            MSi.Outlook.Attachments attmnts = message.Attachments;

            if (attmnts.Count > 0)
            {
                        MessageBox.Show("There are attachments!");
            }

            message.Close(MSi.Outlook.OlInspectorClose.olDiscard);
            CleanUpComObject(attmnts);
            CleanUpComObject(message);
            CleanUpComObject(nmspace);
            outlk.Quit();
            CleanUpComObject(outlk);


Now we are storing the reference to the attachments object we can free it and voilĂ , not OUTLOOK.EXE left running in task manager. So in that same vein...


Be careful of what you let access your interop objects

We know that anything we explicitly access may need free'ing when we are done and that's reasonably easy to manage, but sometimes it's not particularly obvious what objects we need to free. Take this example (this block of code would replace the MessageBox in the last example to make a block of working code):


                    foreach (MSi.Outlook.Attachment item in attmnts)
                    {
                        item.SaveAsFile(FileFunctions.BuildFileName(attchmentDumpPath, item.FileName));
                        CleanUpComObject(item, logger);
                    }


So we are going through each attachment and saving it to disk, and as we already learned we are cleaning up each attachment we use as we go. However if you try this code you will find very quickly that it doesn't clean up after itself and rogue OUTLOOK.EXE processes will be hanging round. This is because the foreach loop creates other objects in the background that you can't reference, and therefore can't release - I believe in this case it's an iterator that doesn't get cleaned up. DO NOT use foreach loops with interop objects, be old school and manage the loops yourself ;)


                    MSi.Outlook.Attachment att = null;

                    for (int j = 1; j <= attmnts.Count; j++)
                    {
                        att = attmnts[j];
                        att.SaveAsFile(FileFunctions.BuildFileName(attchmentDumpPath, att.FileName));
                        CleanUpComObject(att, logger);
                    }


The above code works fine and makes sure there are no objects left laying around, that is assuming there isn't an error saving the attachment! With that in mind make sure you always...


Put the clean up in a finally

I know this is first rule of using objects like this but it's still easy to forget - in writing this I spotted some clean up code that wasn't guaranteed to execute. If I hadn't found it you know the customer would have, so check and check again. As we have already got a fairly solid clean up function that should stop errors or at least handle them without too much drama you just have to move that into a finally block (catch removed for clarity) :


            MSi.Outlook._Application outlk = null;
            MSi.Outlook.NameSpace nmspace = null;
            MSi.Outlook.MailItem message = null;

            try
            {
                 MSi.Outlook._Application outlk = new MSi.Outlook.Application();
                 MSi.Outlook.NameSpace nmspace = outlk.GetNamespace("MAPI");
                 MSi.Outlook.MailItem message = nmspace.OpenSharedItem("C:\myEmail.msg");
            }
            finally
            {
                 message.Close(MSi.Outlook.OlInspectorClose.olDiscard);
                 CleanUpComObject(message);
                 CleanUpComObject(nmspace);
                 outlk.Quit();
                 CleanUpComObject(outlk);

            }


Now that's not bad, we know that CleanUpComObject isn't going to throw but it's plausible that message or outlk could, they could even be null at that point and that will cause the rest of the clean up to be skipped. To cover this you need to make it all a little bit more messy with a finally that looks more like this:


            finally
            {
                if (message != null)
                {
                    message.Close(MSi.Outlook.OlInspectorClose.olDiscard);
                    CleanUpComObject(message);
                }

                CleanUpComObject(nmspace);

                if (outlk != null)
                {
                    outlk.Quit();
                    CleanUpComObject(outlk);
                }
            }


Now we are protected and are doing everything we can to stop COM objects being left hanging around. You could of course make a more generic clean up function that takes an array of objects and uses reflection to call any Close/Exit functions before disposing of them. While it would make the finally more tidy there will be quite a heavy runtime overhead for that so personally I didn't choose that route.


Microsoft Office Interop from a Windows service

This might seem like an extremely specific title for a section especially compared to the rest of the scope in this post and I guess it is. The reason I think it deserves a whole section is that this problem took me as long to research and fix as all the points in the first part put together! Again there is quite a lot of information out there but mostly you're going to find threads with no answers or even worse someone describing your exact problem with the original poster coming back to say "Don't worry I fixed it now, thanks!" - seriously if I ever find you I'm going to hurt you! Why you think that your question is important enough to post it to the entire development community but the answer is not fit for public consumption I will never know. If you ever post question threads please make sure you go back an update them when you solve the problem because guess what, you're not the only person in the world who is ever going to experience that. Even if the answer is "Yeah I was being a tard and I used a double instead of an int" - it doesn't matter! I've probably made the same mistake, or if I haven't I know your problem was unrelated. Remember it's a community not a service. Sorry, rant over.


Gotchas

While researching this section further I actually found another blog post which covers the issues you are likely to come up against extremely well. Not being a fan of duplication I shall just link to the post from Malte who has put the symptoms and exact steps to fix them. I would just add that as with the rest of interop each Office application will give you a different errors for the same problem. In this case the second problem does report as file access problems in Excel but just gives a general error with Word - be on your toes.

You can find Malte's article here.




Outlook Security Update

There is also an Outlook security update that can cause problems with interop. It's primary purpose is to block any 'suspicious' interop activity. What constitutes suspicious - I have no idea, however what I do know is this helpful feature cost me 2 hours when I deployed to our staging system. If you are running Outlook interop from a service this is extremely hard to diagnose as you will make a call and it will simply never return. If you're running from an app you will at least see a pop-up similar to the one below:


The text may change slightly but essentially the message will read "A program is trying to access data from Outlook that may include address book information. Do you want to allow this?". This is a security setting inside Outlook that you can turn off, thankfully. Load Outlook and select Tools > Trust Center - > Programmatic Access and select the last option, never warn. This should stop you getting these messages in the future and now calls to the methods that were breaking before (in my case Message.SaveAs) should now return correctly.

I am not sure exactly what the risk is of selecting this option but assuming you are taking the usual precautions (Windows updates, AV, ect) then I personally can't see the problem with this solution, however buyer beware!


I hope this article has been helpful, if you have further questions please leave a comment and I will try to answer it.
-Jynx