
Closed

Properties on data contracts of type Int32 (exposed as object) are deserialized as Int64

description

Here's a snippet to reproduce the issue. Numbers in JSON are just numbers, but BSON has a specific type code for storing Int32 values, so it should be able to retain the type information regardless.
 
using System.Diagnostics;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;

class DataContract
{
    public object Value { get; set; }
}

class Program
{
    private static void Main3()
    {
        var outputStream = new MemoryStream();
        var serializer = new JsonSerializer();
        serializer.Serialize(new BsonWriter(outputStream), new DataContract { Value = 123 }); // 123 is boxed as an Int32 here, due to C# compiler specifics
        outputStream.Position = 0;
        var obj = serializer.Deserialize<DataContract>(new BsonReader(outputStream));
        Debug.Assert(obj.Value is long);         // I think this is wrong, but the value has now changed type!
        Debug.Assert(obj.Value as int? == null); // I was expecting to be able to do this, but it doesn't work due to the change in type
    }
}
 
I've made the necessary changes in both BsonReader and BsonWriter. For some reason it appears that the value has always been stored as an Int32 value, but it is being cast to a long. Funnily enough, these changes don't appear to break existing behaviour (except for those tests that I believe are asserting the wrong type).
 
If I have a property of type object and put an Int32 in it, it will round-trip as an Int32; if I put an Int64 value in there, it will round-trip as an Int64 value. This is how I think it should work.
Closed Aug 1, 2011 at 3:57 AM by JamesNK

comments

leidegre wrote May 24, 2011 at 7:16 AM

I was poking around the source and found this piece of code (see ReadType in file BsonReader.cs) somewhat baffling.

case BsonType.Integer:
    SetToken(JsonToken.Integer, (long)ReadInt32());
    break;

Is that typecast really necessary? If it's what is causing the behaviour, I'd remove it.
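
To be concrete, the change I have in mind is just dropping the cast, roughly:

case BsonType.Integer:
    SetToken(JsonToken.Integer, ReadInt32());
    break;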

leidegre wrote May 24, 2011 at 7:38 AM

I went and made the change myself, rebuilt Json.NET and ran all the tests. Two tests now fail in BsonReaderTest: ReadSingleObject and WriteValues. These both assert that the type is long but that the value is an integer. I can't find a reason, from looking at these tests, why it has to be of type long.

Making this change, removing the type cast here, makes the problem I've described go away. My take on this matter is that Json.NET should retain the type information: if I have a property of type Int32, it should round-trip that value as an Int32 if possible, and since I'm using the BSON format this is completely doable.

leidegre wrote May 24, 2011 at 7:42 AM

Okay, I downloaded the latest release source and made the change, re-ran the tests, and two tests (in BsonReaderTest) now fail: ReadSingleObject and WriteValues. However, neither of these tests seems to assert anything besides the type; they make the equality assertion that 1 (of type Int32) equals 1 (of type Int64), but I fail to see from these tests why it has to be Int64, so I suggest that we change this and remove the type cast. I think that Json.NET should round-trip the type information when possible; I cannot think of a reason why this has to be like this, and I would expect the type to be retained when possible.

leidegre wrote May 24, 2011 at 7:43 AM

OMG, this issue tracker is stupid: can't edit, can't delete. Sorry for the double post; I thought my first entry was lost. If you have moderation capabilities, please clean this up a bit.

leidegre wrote May 25, 2011 at 6:55 AM

Given that JSON is based on a subset of the JavaScript Programming Language (Standard ECMA-262 3rd Edition, December 1999), I have to assume that the memory model of JSON closely relates to, or is actually that of, the ECMA specification.

This leads to fact number 1: there is no notion of a 64-bit integer in JSON. There is only the number data type.

A number is a 64-bit double-precision floating point number according to the IEEE 754 format. These numbers can represent integers of up to 53 bits without loss of precision.
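
A quick C# illustration of that limit (the boundary is 2^53):

long exact = 9007199254740992;      // 2^53, representable exactly as a double
long inexact = 9007199254740993;    // 2^53 + 1, cannot be represented exactly
Console.WriteLine((long)(double)exact == exact);     // True
Console.WriteLine((long)(double)inexact == inexact); // False: rounds back down to 2^53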

Fact number 2: bitwise operations such as the arithmetic left shift are unsupported for values that don't fit in a signed 32-bit integer. This leads me to conclude that, in JavaScript, the concept of an integer other than the 32-bit signed integer doesn't exist.

The argument that Int64 is less likely to overflow doesn't hold up, in my opinion. Besides, the most common data type is unquestionably the 32-bit signed integer.

The JSON syntax only describes a numeric value. However, given how the value is encoded, assumptions that won't harm your existing memory model can be made about the underlying type, e.g. any integer string less than or equal to 2147483647 can internally be represented as a 32-bit signed integer. Any value which doesn't fall within this range has to be represented as a double. The CLR will then handle any necessary conversions for you.
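
Roughly what I mean, as an untested sketch (ParseNumberToken is a made-up helper, not anything in Json.NET):

static object ParseNumberToken(string token)
{
    // no decimal part or exponent means the token is an integer
    bool isInteger = token.IndexOfAny(new[] { '.', 'e', 'E' }) < 0;
    int value;
    if (isInteger && int.TryParse(token, out value))
        return value; // fits in a signed 32-bit integer, keep it as Int32
    // everything else (fractions, exponents, out-of-range integers) becomes a double
    return double.Parse(token, System.Globalization.CultureInfo.InvariantCulture);
}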

The point I'm trying to make is that if the type information can be retained, it should be, and it shouldn't involve 64-bit integers.

In terms of compatibility with the CLR, 64-bit integers are unsupported by JSON (just like date and time) and should be treated as such.

Also, I could not find any tests that asserted this behavior in the test suite. I'll gladly take it upon myself to look into adding this type of support (with your blessing), with extensive testing so as not to break existing code. I feel that it is important to stay true to the type constraints of JavaScript/JSON but at the same time play nice with the .NET CLR.

JamesNK wrote May 25, 2011 at 6:57 AM

Json.NET by default reads integer values as Int64 because there is no way to know whether the value should be Int32 or Int64, and Int64 is less likely to overflow. For a typed property the deserializer knows to convert the Int64 to an Int32, but because your property is untyped you are getting an Int64.
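
To illustrate the difference (a rough sketch; the contract classes here are just made up for the example):

class TypedContract   { public int Value { get; set; } }
class UntypedContract { public object Value { get; set; } }

var typed = JsonConvert.DeserializeObject<TypedContract>("{\"Value\":123}");
// typed.Value is an Int32: the property type tells the deserializer what to convert to.

var untyped = JsonConvert.DeserializeObject<UntypedContract>("{\"Value\":123}");
// untyped.Value is boxed as an Int64: with an untyped property there is nothing to convert to.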

That bit of source code you posted isn't a bug; I did that so that BsonReader and JsonTextReader produce consistent results. It is just the way Json.NET has to work.



** Closed by JamesNK 2011-05-24 15:42

leidegre wrote May 25, 2011 at 6:57 AM

I made a remark about the necessity of a 64-bit integer I'd like you to review.

JamesNK wrote May 25, 2011 at 7:28 AM

I'd rather not have the underlying type of an integer change depending on its size.

Having Int64 all the time is only noticeable when deserializing to untyped properties, and it would still be weird if, when deserializing a list of numbers, some of them came back as Int32 and others as Int64 depending on their size.
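
For instance (sketch only; 3000000000 is just an arbitrary value above Int32.MaxValue):

var list = JsonConvert.DeserializeObject<List<object>>("[1, 3000000000]");
// Today both elements come back as Int64, which is at least consistent.
// Under a size-dependent scheme, list[0] would be an Int32 and list[1] an Int64 (or a Double),
// so the element type would change from item to item.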

leidegre wrote May 25, 2011 at 10:42 AM

The list situation is perfectly valid, as JSON is dynamic; it's only the data contract which imposes a static type onto the dynamic object. I have to think about this some more, but I still believe this should be handled differently. I cannot propose a better change at this moment, but rest assured, I will get back to you on this.

leidegre wrote May 25, 2011 at 10:50 AM

For the record, the type would never change between Int32 and Int64 because Int64 is not a valid value. If you try to cram an Int64 value into a double (which is the only representation known to JSON) you will have rounding errors, so there is always a chance that you lose precision when you involve Int64 values. That is why, first of all, I would like to treat them differently.

An integer is identified as any value which doesn't have a decimal part, i.e. "1.0" is not an integer but "1" is. The JsonTextWriter can do this and would never write a double as simply "1"; it would always write "1.0", and with BSON you have the type information in the document. The type would never change depending on the magnitude of the value. This would effectively allow you to preserve/round-trip the type information regardless of whether you use JSON or BSON as storage.
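
Something like this is what I have in mind (untested sketch; the exact output assumes the writer behaves as I describe above):

var sb = new StringBuilder();
using (var writer = new JsonTextWriter(new StringWriter(sb)))
{
    writer.WriteStartArray();
    writer.WriteValue(1);    // Int32 value: written as the integer token "1"
    writer.WriteValue(1.0);  // Double value: written as "1.0" so the token stays recognisably a double
    writer.WriteEndArray();
}
// sb.ToString() would then be "[1,1.0]", so a reader can map the tokens back to Int32 and Double.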