Author |
Topic |
|
MarcelG
Retired Support Moderator
Netherlands
2625 Posts |
Posted - 28 January 2009 : 15:58:33
|
It's been little over a month since I upgrade oxle from MS Access to SQL Server 2000. All seemed well, but today I tried to post the word #1070;#1087;#1080;#1090;#1077;#1088; (jupiter in Russian) and it came out ???????.
Strange thing however is that it did work: http://oxle.com/topic/4009.html still shows the site seems to be able to succesfully retrieve the data from the database, so it's still there. But, if I look in the database it does not show the contents of t_message for that topic......
I checked the MS knowledge base and all I found was this: http://support.microsoft.com/kb/232580
Is there something I should change in the way I connect to the SQL database, or should something be changed on the SQL server itself to enable this?< |
portfolio - linkshrinker - oxle - twitter |
Edited by - MarcelG on 28 January 2009 15:59:18 |
|
MarcelG
Retired Support Moderator
Netherlands
2625 Posts |
Posted - 29 January 2009 : 05:04:11
|
BTW: I noticed in the SQL Enterprise Manager that the database uses SQL_Latin1_General_CP1_CI_AS .... is that the cause? I also notice that here at Snitz UTF-8 is not supported at all ; is that just a fact, or the result of the decision not to support them?< |
portfolio - linkshrinker - oxle - twitter |
Edited by - MarcelG on 29 January 2009 07:26:34 |
|
|
ruirib
Snitz Forums Admin
Portugal
26364 Posts |
Posted - 29 January 2009 : 05:23:33
|
I'm not sure about the reason for your problem. I would expect that direct insertion of the characters would work, but I don't have enough experience to state that was the cuase for your isse.
A collation in a SQL Server database is used for mailing for sort ordering purposes, so it does not affect what's stored in Unicode (it defines the code page for non-Unicode data). More info on collations: http://msdn.microsoft.com/en-us/library/aa174903(SQL.80).aspx
Snitz is an english speaking forum, so no need to support UTF-8. I see no other reason than that, though this is just an assumption, no one told me this.< |
Snitz 3.4 Readme | Like the support? Support Snitz too |
|
|
MarcelG
Retired Support Moderator
Netherlands
2625 Posts |
Posted - 29 January 2009 : 07:02:35
|
Mmmm, I'm reading in on the collation stuff, and I must say it's interesting reading material. It's curious to see that Microsoft does not natively support UTF-8 as input. Source: http://support.microsoft.com/kb/232580
quote: UTF-8 is not valid character data to SQL Server
Perhaps the route I took with making Unicode available on oxle 2,5 years ago was completely wrong, but I'm a bit disappointed that there is no clear approach described anywhere on how to do this. This disappointment is not about Snitz at all, but more in general ; it's a global internet, but making your applications work globally for all languages seems to be nearly impossible. I assumed UTF-8 was the way to go, but that doesn't seem true when you're using MS SQL.< |
portfolio - linkshrinker - oxle - twitter |
|
|
HuwR
Forum Admin
United Kingdom
20584 Posts |
Posted - 29 January 2009 : 07:35:45
|
data should be stored in MS SQL as Unicode data, the character set (UTF-8) is not relevant for what is stored in the db, as long as you are storing unicode data you can store any language you want. There is nothing special about the setup here, yet it is possible to store any language in the message
the string you tried to post posts perfectly well here, and since you were posting hex values why do you think it is the db that issue, since the db should be storing the hex string not the actual characters (unless you pasted thecaharcters rather than the hex codes, so did you post the hex codes or actually cut and paste the chars from somewhere ?
< |
|
|
MarcelG
Retired Support Moderator
Netherlands
2625 Posts |
Posted - 29 January 2009 : 07:48:24
|
I wasn't posting hex values when I posted the russian word for Jupiter ; I was posting actual russian characters, and they were shown here as the hex values only *after* the post. Example: #1088;#1091;#1089;#1089;#1082;#1080;#1081;
Here's what the post window looks like:
In the preview here at Snitz it looks ok: (click for bigger version)
At oxle each character as a questionmark instead of a hex equivealent. That's a difference compared to here at Snitz.< |
portfolio - linkshrinker - oxle - twitter |
Edited by - MarcelG on 29 January 2009 07:49:22 |
|
|
HuwR
Forum Admin
United Kingdom
20584 Posts |
Posted - 29 January 2009 : 08:01:49
|
my guess is that you need to adjust the queries so that they are infact correctly storing the data as unicode, that is what the .Net version is doing and that has no problems displaying or saving foreign character sets (it is using exactly the same database server as the standard asp version, the only difference is the use of N to prefix the unicode strings in a query< |
|
|
MarcelG
Retired Support Moderator
Netherlands
2625 Posts |
Posted - 29 January 2009 : 08:08:43
|
Mmmm, I read something about that n prefix too. Is it feasible to enable that for the ASP version?< |
portfolio - linkshrinker - oxle - twitter |
|
|
HuwR
Forum Admin
United Kingdom
20584 Posts |
Posted - 29 January 2009 : 09:27:03
|
yes, it is SQL not ASP, just change the queries so that wherever you update T_MESSAGE or R_MESSAGE etc so that it has N in front of the string to be inserted/update.
for example in post_info.asp the reply update query around line ~360 would change from
strSql = "UPDATE " & strActivePrefix & "REPLY " strSql = strSql & " SET R_MESSAGE = '" & txtMessage & "'"
to
strSql = "UPDATE " & strActivePrefix & "REPLY " strSql = strSql & " SET R_MESSAGE = N'" & txtMessage & "'" < |
|
|
HuwR
Forum Admin
United Kingdom
20584 Posts |
Posted - 29 January 2009 : 09:27:48
|
you would do this for any string that needs to be unicode, ie message/subject etc< |
|
|
MarcelG
Retired Support Moderator
Netherlands
2625 Posts |
Posted - 29 January 2009 : 09:41:36
|
Mmmm, that's a lot of search and replace, but I'll give it a try. If I'm correct all the topic & reply related updates and inserts are done through post_info.asp. The other updates are done in the profiles (registration and profile edits)...this sounds feasible. Now just find some time to do this.
-edit- first test: http://oxle.com/topic/5707.html It works!!! Thanks Huw!
-second edit- It's done!! At least for forum (title/subject), topic (title, topic, reply). I'm not going to do the text fields of the profile page, nor am I going to support unicode usernames (that's just creepy...) I'll have to test some things, such as the split topic mod, but I think it's pretty easy now I know what I should change.< |
portfolio - linkshrinker - oxle - twitter |
Edited by - MarcelG on 29 January 2009 10:01:33 |
|
|
HuwR
Forum Admin
United Kingdom
20584 Posts |
Posted - 29 January 2009 : 09:59:04
|
I wouldn't try it using search and replace anyway, it is not as many as you think, there are only a handful of instances that need changing if that (you only need to change where T_MESSAGE,R_MESSAGE or T_SUBJECT are being inserted or updated.)< |
|
|
HuwR
Forum Admin
United Kingdom
20584 Posts |
Posted - 29 January 2009 : 09:59:58
|
as you suggest, only do posts, forget about the profile stuff.< |
|
|
MarcelG
Retired Support Moderator
Netherlands
2625 Posts |
Posted - 29 January 2009 : 10:04:40
|
Yup, there were not that many instances where I had to use the N' trick. Only 16 if I'm right. Now I can freely post the real russian name of the lens I just bought. Thanks Huw!!!!< |
portfolio - linkshrinker - oxle - twitter |
|
|
|
Topic |
|