In my current project we're using XmlSerializer a lot. At one point I need to write about 400 files with it, and for some reason I created a new XmlSerializer for each of these files.
While debugging, I noticed a lot of messages like these in the Visual Studio Output window:
'Application.vshost.exe' (Managed (v4.0.30319)): Loaded 'm0dayvr5'
'Application.vshost.exe' (Managed (v4.0.30319)): Loaded 'oxxqw1rq'
'Application.vshost.exe' (Managed (v4.0.30319)): Loaded 'dgh3mgtl'
'Application.vshost.exe' (Managed (v4.0.30319)): Loaded '00sdpqlv'
'Application.vshost.exe' (Managed (v4.0.30319)): Loaded 'yokpozj4'
Then I remembered that the XmlSerializer creates dynamic assemblies at runtime and loads them into the AppDomain (and never unloads them, since you cannot unload an individual assembly from an AppDomain). In the MSDN documentation I found a passage that explains this behavior and offers a solution:
Dynamically Generated Assemblies
To increase performance, the XML serialization infrastructure dynamically generates assemblies to serialize and deserialize specified types. The infrastructure finds and reuses those assemblies. This behavior occurs only when using the following constructors:
XmlSerializer.XmlSerializer(Type)
XmlSerializer.XmlSerializer(Type, String)
If you use any of the other constructors, multiple versions of the same assembly are generated and never unloaded, which results in a memory leak and poor performance. The easiest solution is to use one of the previously mentioned two constructors. Otherwise, you must cache the assemblies in a Hashtable, as shown in the following example.
Of course you can use any other data structure to do the caching. But you have to do it on your own if you do not use the two mentioned constructors. I’ve got no idea why Microsoft builds caching into the code for some constructors and not for others, but that’s the way it is.
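As a rough sketch of such a cache, here is a version built on ConcurrentDictionary instead of a Hashtable. The names XmlSerializerCache and Get, and the choice of type plus root element name as the cache key, are my own assumptions for illustration; they are not part of the .NET class library or the MSDN article.

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

// Hypothetical helper class, written only for illustration.
public static class XmlSerializerCache
{
    // One serializer per (type, root element name) combination.
    private static readonly ConcurrentDictionary<Tuple<Type, string>, XmlSerializer> Cache =
        new ConcurrentDictionary<Tuple<Type, string>, XmlSerializer>();

    public static XmlSerializer Get(Type type, string rootName)
    {
        // XmlSerializer(Type, XmlRootAttribute) is one of the constructors
        // without built-in caching, so the instance is stored and reused here.
        // Under contention GetOrAdd may run the factory more than once,
        // but only one instance ends up in the dictionary.
        return Cache.GetOrAdd(
            Tuple.Create(type, rootName),
            key => new XmlSerializer(key.Item1, new XmlRootAttribute(key.Item2)));
    }
}
```

Calling XmlSerializerCache.Get(typeof(MyClass), "MyRoot") for each file (MyClass and "MyRoot" being placeholders) would then create the serializer once and hand back the same instance afterwards.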
Some performance measurements
I didn’t build a great benchmark, but at least I got a few numbers. On my machine, serializing a file without caching takes about 140 ms in debug mode and 130 ms in release mode. Every time. With caching, only the first file takes these 140 / 130 ms, and every additional file takes about 8 ms (debug) or 6 ms (release).
That’s a speedup of about 17x (debug) to 22x (release) for every file after the first!
And you do not produce memory leaks!
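If you want to check both claims yourself, a small comparison along these lines might be a starting point. It is only a sketch under a few assumptions: Sample is a made-up payload type, the objects are written to a StringWriter rather than to files, and the uncached case uses the XmlSerializer(Type, XmlRootAttribute) constructor, which may not be the one I used in the project. The assembly count also includes anything else the runtime happens to load in the meantime, so treat it as a rough indicator.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

// Hypothetical payload type, standing in for the real file contents.
public class Sample
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class SerializerComparison
{
    // Serializes `count` objects, asking `getSerializer` for a serializer each time.
    // Prints the total elapsed time and returns how many assemblies were
    // loaded into the AppDomain while the loop was running.
    public static int Measure(string label, int count, Func<XmlSerializer> getSerializer)
    {
        int assembliesBefore = AppDomain.CurrentDomain.GetAssemblies().Length;
        Stopwatch stopwatch = Stopwatch.StartNew();

        for (int i = 0; i < count; i++)
        {
            XmlSerializer serializer = getSerializer();
            using (var writer = new StringWriter())
            {
                serializer.Serialize(writer, new Sample { Id = i, Name = "item" + i });
            }
        }

        stopwatch.Stop();
        int loaded = AppDomain.CurrentDomain.GetAssemblies().Length - assembliesBefore;
        Console.WriteLine("{0}: {1} ms, {2} assemblies loaded",
            label, stopwatch.ElapsedMilliseconds, loaded);
        return loaded;
    }

    public static void Run()
    {
        // Uncached: a new serializer per object, via a constructor without built-in caching.
        Measure("new serializer per object", 20,
            () => new XmlSerializer(typeof(Sample), new XmlRootAttribute("Sample")));

        // Cached: one serializer created up front (outside the measurement) and reused.
        var shared = new XmlSerializer(typeof(Sample), new XmlRootAttribute("Sample"));
        Measure("one shared serializer", 20, () => shared);
    }
}
```

The absolute numbers will differ from machine to machine, so the interesting part is the ratio between the two lines of output.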