1

Closed

Thread issue in Newtonsoft.Json.Serialization.DefaultMappingResolver

description

in "public virtual JsonMemberMappingCollection ResolveMappings(Type type)"
 
_typeMemberMappingsCache's TryGetValue is not thread safe, and should not be used outside of the lock.
 
If we only read the collection, then we are fine, but since there is a chance that we are modifying the collection in one thread (inside the lock) as we are reading it in another (outside the lock), data can get corrupted (reads can fail, or return incorrect data).
Closed Jul 30, 2009 at 11:17 AM by JamesNK

comments

JamesNK wrote Jun 10, 2009 at 5:01 AM

Did you get an error from this?

I'm not an expert when it comes to threading. Do you know what the best approach to handling locking in this class would be?

amirshim wrote Jun 10, 2009 at 11:10 PM

The likelihood of getting an error from this is low (as with many multi-threading bugs), but still possible, and should be fixed. It could return something completely wrong.

The simplest fix is to put all access (read and write) of _typeMemberMappingsCache inside the lock. This might have a performance penalty since each access has to acquire the lock.

Since the dictionary only gets added to and we don't care if an element goes missing (since it is invariant), another solution is to simply copy the entire dictionary everytime you add an element and replace the reference to it. If there is a race condition then the double try-read will take care of it.

so instead of: (inside ResolveMappings)
    TypeMemberMappingsCache[type] = memberMappings;
we will do something like:
    // copy the entire dictionary
    var typeMemberMappingsCacheCopy = new Dictionary<Type, JsonMemberMappingCollection>(TypeMemberMappingsCache);
    // add the one element to the copy
    typeMemberMappingsCacheCopy[type] = memberMappings;
    // replace the current one with the copy
    System.Threading.Interlocked.Exchange(ref TypeMemberMappingsCache, typeMemberMappingsCacheCopy);
Obviously, you will need to make the instance variable (TypeMemberMappingsCache) non-readonly.

Although this is not that efficient, it will stabilize as the cache gets populated. And after a while, only use the non-locked version.

To speed up the initial population, so we don't copy the dictionary on each add, we could do one of a couple of things. We could keep to versions: the one that is up-to-date and is only used under the lock and the copy. We then only make a copy after the locked version has been hit "X" number of times, or if it hasn't been updated to "X" amount of time. I think the latter is better since it doesn't care if you have 1 or 10000 types, but will start using the non-locked version after "X" seconds of not being updated (for example 100ms).

I can write a patch if you would like.

JamesNK wrote Jun 11, 2009 at 2:50 AM

Creating a new Dictionary each time sounds rather drastic.

Would a ReaderWriterLock be appropriate? I have worked on a project that used them but I wasn't familiar with that specific part of the code.

amirshim wrote Jun 11, 2009 at 4:33 AM

Theoretically yes, but you'll take a huge performance hit for your general use case, which is probably no/minimal threads (little contention). Even if you use ReaderWriterLockSlim, it gonna be slower (1.5x) on one thread then using a plain lock (i.e. my first solution of putting everything under the current lock that is already in place). See:
http://blogs.msdn.com/pedram/archive/2007/10/07/a-performance-comparison-of-readerwriterlockslim-with-readerwriterlock.aspx

Obviously, using as RW lock, you would gain performance (over the basic lock solution) for programs that serialize on many threads with high contention, but if that is the audience, then you might as well optimize for that too, by doing a copy solution and losing a bit of performance on program startup.

I agree that copying everytime seems a little overkill, but that's why you should do a timed based copy.

Even without it, since the copies happen pretty close to each other, the newly allocated dictionary will unlikely escalate to Gen1, so they they will be garbage collected almost immediately. It's a small price to pay up-front (program startup) for very fast (i.e. the fastest you can hope for) execution in the long run. There are only so many new types a program can want to serialize.

Just say the word, and I'll write the timed code that doesn't copy every time... it's not complicated, I just don't want to spend time on it if you have no intention what-so-ever of using it.

JamesNK wrote Jun 16, 2009 at 10:53 AM

I have implemented a ThreadSafeDictionaryWrapper class that hopefully resolves the threading issue. There are many caches in Json.NET and I wanted something generic that could be used for all of them.

It has been checked into the source code in CodePlex.

I'd be interested to know what you think :)



** Closed by JamesNK 6/16/2009 2:53 AM

JamesNK wrote Jun 16, 2009 at 10:53 AM

Opps, meant to only add that as a comment!

amirshim wrote Jun 16, 2009 at 3:20 PM

The Copy-On-Write code looks great. There are a few minor optimizations that I would think about, such as not calling _creator() under the lock and only copying the dictionary every X amount of time or X cache misses. But since all of this is related to program startup-cost, I am very happy with the current solution.
Thanks for the quick and prompt fix,
 Amir Shimoni