For those who look for a quick solution:
Try switching off the win32manifest-switch in Visual Studio.
Here is the long story.
Hi! As you may already know we are working on the best .NET – ARIS interface there is so far. It’s an interface that makes heavy use of an old C-Api, using interop and p/invoke.
In Mid-December Benjamin found a very strange behavior that almost spoiled my Christmas. We had just release Version 1.1 when he called me: “Arisan crashes with Vista64.” I was shocked. “And with Vista32 as well”. I felt annihilated. We had done so much testing with various versions of XP, Vista and it worked fine with any of those OSes.
After some testing I saw that ARIS didn’t deliver any pointers at all. That’s rather bad for an interface to C that uses pointer in every call. After one more day we had more evidence: We had switched to .NET3.5 and VS2008 some month ago and our old versions, that were compiled using the .NET2.0-compiler, still worked.
I was so terrified. The 3.5-framework should be similar to the 2.0-framework, just some (not so) little enhancements. A bug in the 3.5-compiler?
To provide a quick solution we re-wrote parts of the fancy 3.5-syntax, to make our newest enhancement compile with the 2.0-compiler again. That worked (at least we thought so) and we released version 1.1.1.
Then a very frustrating time began. We read articles over and over and quickly found this very interesting post. The problems described there where exactly what we saw. So we followed the track of the nxcompat-flag and DEP, but it didn’t lead us anywhere.
The C-Api we interface is well documented, but it calls hundreds of other components and one of them crashed deep inside an undiscovered country.
We read pages of differences between 2.0- and 3.5-compiler, new compiler-flags (should have paid more attention here), any documentation we could find, but nothing seemed to help.
We had a feeling it could have something to do with Vista UAC, but we didn’t find any matching explanations to our problem. Being really desperate we started this thread at the MS forums. As you can read there, no one could provide a solution that worked.
I was really sad, because the last thing I wanted to do was to switch the entire development of arisan back to VS2005. I got prepared for a very sad life from now on.
But five days ago my life turned to happiness again. One more time I started of by googling words like “.NET compiler 2.0 3.5 differences” and one more time I read the new features of the 3.5-compiler. This time I obviously paid more attention to the compiler-flags and read about the /win32manifest and /nowin32manifest – switches. I didn’t have much hope left, but tried to set the /nowin32manifest switch.
It worked! Everything worked! The C-Api delivered pointers again. I tried switching the win32manifest on and off and a day later Benjamin confirmed my results. We had found it. But after a long time of suffering (for me it felt like half a year) we still were skeptical and thought we needed a deeper understanding of the problem.
The win32manifest and UAC
Finally we had something to look for and quickly found valuable information on the great I’m just sayin’ blog. The documentation of the switches helped as well, so here is a summary:
The win32manifest is not the .NET-manifest.
The win32manifest is part of the UAC (user access control). UAC was introduced with Vista as an answer to more and more security threads. One of its ideas is to keep malicious software from secretly starting to work. If you work with Vista you may have noticed a lot of dialogs that ask you for permission to run certain programs. THAT is UAC.
Now the win32manifest inside an application tells the OS what level of permission it needs to run. There are levels like “asInvoker” (default), “highestAvailable” or “requireAdministrator”. In case the OS (and the current user) can provide the requested permission, the program runs as a trusted process (according to the granted access level). Of course, 90+% (just my estimation) of the programs on you Vista-machine don’t have a win32mianifest, because it’s a new feature and older applications just don’t have it. (And I think a majority of newer applications won’t have it either). In order to be compatible with no-win32manifest applications on one hand and providing security on the other hand, Vista can do a miracle called “virtualization”. It’s a bit like running a process in a sandbox: Vista provides anything the process may need to run: disk space, memory, a registry etc. So the process feels cosy.
But anything provided this way is separated from the main system. The process uses a copy of the registry, writes to a special parts of the HDD drives and accesses the memory in a controlled way (not sure about the memory thing…). For me this is a great achievement of Vista, almost magic. I can’t tell in detail how it works (mainly because I don’t have a clue), but it works good and doesn’t cost notable performance.
So on your Vista machine you have processes that run virtualized and others that run non-virtualized. “Non-virtualized” is also called “UCA-compatible” or just “compatible”, so I’ll adopt this wording here.
You could say that Vista trusts compatible processes a little bit more than virtualized processes. That doesn’t mean virtualized processes are evil, but.. well… maybe… under certain circumstances… not 100% trustworthy. (Here is another nice article)
Note three facts here:
- The state (virtualized/compatible) always counts for a process and is set when the process starts. When you start an .exe-file, Vista decides how to run it.
- Because of 1) there is no sense in attaching a win32manifest to components that do not start a process, like .dlls. Thus .dlls don’t have a win32manifest.
- The state (virtualized/compatible) of a process doesn’t change during its lifetime.
So if a .dll is used by a virtualized process, its code also runs virtualized. And if the same .dll is used by a compatible process, its code runs compatible.
With some of that in my mind, I could explain the problems we had with our interface software arisan: arisan is a .dll you reference in your .NET-application and use it. As mentioned above arisan calls a native (so non-.NET) .dll using p/invoke-interop-techniques. Now this native .dll starts a bunch of new processes: a database-server, a database-driver, some middle-layers, etc. I am pretty sure non of this newly started processes is based on applications with a win32manifest – so they all run virtualized.
But the .NET-applications we use to test the arisan.dll started running compatible as we switched to VS2008 and the 3.5-compiler. So our test application (compatible) called arisan.dll (compatible), arisan.dll (compatible) called C-Api.dll (compatible). C-Api.dll started new processes (virtualized !!) and called functions there. And in the end a compatible (fully trusted) process asked a virtualized (less trusted) process to fill some pointers… and that’s the point where Vista interferes and says: “Dudes, are you drunk? No way!”
It was a bit of bad luck that the clash of a virtualized and a compatible process happened so deep inside a software we couldn’t debug and only threw some strange, meaningless error-messages, but on the other hand Vista could have told us more. Maybe there is some place in Vista where the clash is logged. If someone knows – please let me know, too. Let me get this straight: Vista is absolutely right here, but I’d wish more information when it happens. I don’t exactly know how, but you could do harmful things if you could easily exchange all sorts of pointers between virtualized and compatible processes…
So finally the solution was easy. Both processes have either to run virtualized or non-virtualized. Since we can’t make a third-party software run compatible, we have to make our .NET-application run virtualized: We just need to prevent .NET from inserting a win32manifest to executables.
Using the compiler from the command-line the compiler-flag /nowin32manifest does it.
In VS2008 same switch can be found on the “Application”-tab of the project properties.
The manifest-area is enabled for ‘console applications’ and ‘Windows applications’. It is disabled for class libraries.
Another way to handle this problem is to switch off UAC completely. But we strictly advice you against this solution. UAC is one of Vista’s main features and it is designed to provide shelter for your precious data in times of evil viruses and brutal intrusion attempts
Epilog- bad fix
After we understood the problem, we saw that the ‘fix’ we provided as V1.1.1 couldn’t do much. Ok – it made the demos run, because the executable compiled with the 2.0- come without a win32manifest and thus run virtualized. But users trying it with VS2008 still had the same issues.
Actually we cannot provide a technical solution with the arisan.dll, because if the running mode depends on the executables that use arisan. So we’ll write detailed information and everyone lived happily ever after.