Character set issues


Not sure if this is a Postman issue but would like some help ruling it out.

So, I am storing data into an MS SQL Server 2008 database using a SQLXML updategram.

SQLXML provides a RESTful-like XML interface to SQL server.

For my test purposes, I have a table ‘PILongTextTest’ with the following structure
[LocationLong] nvarchar NULL,
[DateCreated] [smalldatetime] NOT NULL,
[LocID] [int] IDENTITY(1,1) NOT NULL

I have an updategram with the following structure
<intros xmlns:updg=“urn:schemas-microsoft-com:xml-updategram” xmlns:sql=“urn:schemas-microsoft-com:xml-sql”>
<updg:param name=“piuid”>1</updg:param>
<updg:param name=“loc_long”>999999</updg:param>
<PILongTextTest PIUID=“$piuid”
SELECT 1 as Tag,
NULL as parent,
PIUID as [PILongTextTest!1!PIUID],
LocationLong as [PILongTextTest!1!LocationLong!xml],
DateCreated as [PILongTextTest!1!DateCreated],
LocID as [PILongTextTest!1!LLocID]
FROM PILongTextTest
WHERE PIUID = @piuid
FOR XML Explicit

The updategram is accessed using this URI in Postman

A depth of 3.0 m.

I would expect this to create a database record, and return an xml copy of that record, which it does

If I change the data to add a non-breaking space, the data seems to be stored correctly, but when return to Postman, is not displaying ‘correctly’. There is an ‘odd’ Â character in front of the space

I believe this is a marker in the character encoding to indicate the use of UTF-8.

I have tested Postman similarly with other extended characters with similar results.

I believe my use of nvarchar in the table definition means that the table is capable of storing UTF-8 correctly.

So my questions are:

  1. Is that data being stored correctly in the database, and how can I test this? (perhaps outside the scope of this forum)
  2. On the assumption that it is stored correctly, is Postman displaying these ‘extended’ characters as expected? (SQL Server Management Studio displays similarly)
  3. Is there a way I can change this behaviour, perhaps in the http header (this IS Postman relevant) and requests?

Any suggestions?

@crispness This is a really good question.
I think we would need to answer Question 1 before considering the others.

Is there a way for you to query the DB directly? or you could also make this call via command line using cURL.

This would tell us where the issue lies and then we can move forward with troubleshooting if needed.

Well, the only query I can think of might be something like locates the position of the extended character in the text stream and asks SQL what is the character code for that character. But I’m afraid I don’t understand the underlying structures well enough to understand if the returned value accurately reflects the actual character code of that character or whether sql server sees it as 2 separate bytes of data in the nvarchar field.

I’m not sure what that would prove.

Do you have any other suggestion of how I might query the DB directly?

I haveen’t tried cURL, but I have used other command line methods with similar results. Thats what brought me back to Postman as the ability to script and test and vary the approach seemed a better testing platform. Certainly copy/pasting the full URL into a browser works similarly. And using the document() function in XSLT does as well (the original source of the data was a set of XML files and I was using XSLT to extract from the XML files and post to SQL. Postman simplifies the process in my eyes.)

So you are saying that you are seeing the char show up when using other command line utils?

Yes, it is always there no matter how I try to store the data

Ok, so in that case I think we can rule out Postman being an issue here. Correct?