Author |
Topic  |
|
MedvidekPU
Starting Member
Czech Republic
6 Posts |
Posted - 03 January 2003 : 04:26:59
|
Bozden, I have read some of your notes about using LCID and I have to warn you: it is a useless thing (especially for Czech-language websites). Microsoft screwed it up completely, and on some IIS versions it does not work at all.
The only real solution, which can be found on any .cz website running IIS, is to add the appropriate META tag with a charset definition.
It's even worse: any Czech page running on IIS should have one more META tag added.
<meta http-equiv="Content-Language" content="cs" >
<meta http-equiv="Content-Type" content="text/html; charset=windows-1250" >
Those two are an absolute MUST.
And if I may suggest, it would be necessary to add support for this to v4.
Daniel
|
Deleted
deleted
    
4116 Posts |
Posted - 03 January 2003 : 06:21:18
|
LCID is used to differentiate between locales; server support for them is not a must. Although LCID seems to be MS only, Snitz also supports non-IIS platforms. On the other hand, having that support makes date-time format conversion quicker, so we are also investigating ways to use it when it is available.
For our purposes, a country and/or language code is not enough, because it does not give complete control over the possibilities. Anyway, we use LCID coding because it is the most advanced.
"Content-Type" is already part of the locale, and your advice about "Content-Language" has been noted.
Two questions, though:
1) Can you tell me what happens when "Content-Type" is specified but "Content-Language" is not?
2) How does the Czech language behave under UTF-8?
Stop the WAR! |
 |
|
n/a
deleted
  
593 Posts |
Posted - 03 January 2003 : 15:44:52
|
A little side comment on this topic... I think there is some inherent problem in converting Windows-1250 text to UTF-8. I just tried with the Slovak lang file (I read that there was a Czech lang file translated from the Slovak one), but some characters are not converting correctly. This also happened with Arabic originally composed in Windows 1250. I suspect that some of these languages need to be created in Unicode (UTF-8) directly, as someone did for Georgian. Some of these must be special chars which may not map into Unicode well....
You can take a look at this at i2Asia.....
Taku
|
 |
|
Tmpj
Junior Member
 
Denmark
467 Posts |
Posted - 03 January 2003 : 16:16:45
|
I have tried Snitz on many Linux servers with either iASP or Chili!ASP and it worked fine! But if I couldn't get anything other than Linux for a website, maybe I would have used phpBB or InvisionBoard.
 |
|
n/a
deleted
  
593 Posts |
Posted - 03 January 2003 : 17:32:31
|
There is also a forum running on Linux with Chili!ASP and Shift-JIS Japanese, with all email components now working with Japanese etc., including file object handling ...... check: http://www.airwork.jp/forum/default.asp
Taku
|
 |
|
n/a
deleted
  
593 Posts |
Posted - 03 January 2003 : 22:14:12
|
Well, using Notepad on XP, I could save/convert the Slovak (LCID 1051, Windows-1250) file into UTF-8, and there seems to be no character corruption.... (no more boxes or numeric codes showing up in place of special chars)... so I assume that, like the other exotic languages deployed in UTF-8, Czech will probably render fine in UTF-8....
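For anyone who wants to repeat this check without Notepad, here is a minimal sketch in Python. It assumes the source bytes really are Windows-1250 (Python's "cp1250" codec); the function name and the sample sentence are only illustrations and are not taken from any Snitz lang file.

```python
def cp1250_to_utf8(data: bytes) -> bytes:
    """Decode Windows-1250 bytes and re-encode them as UTF-8."""
    return data.decode("cp1250").encode("utf-8")


# Sample Czech sentence with many accented letters (illustrative only).
sample = "Příliš žluťoučký kůň úpěl ďábelské ódy"

original_bytes = sample.encode("cp1250")
converted = cp1250_to_utf8(original_bytes)

# Decoding the result as UTF-8 gives back the same text, so nothing is lost.
round_trip_ok = converted.decode("utf-8") == sample

# Reading the same bytes with the wrong source charset is what produces
# the wrong characters: the bytes are fine, the label is not.
wrong_guess = original_bytes.decode("latin-1")
looks_corrupted = wrong_guess != sample
```

If the round trip holds but the page still shows boxes, the likely suspect is the declared source charset rather than the conversion itself.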
quote: Originally posted by LeoRat
A little side comment on this topic... I think there is some inherent problem in converting Windows-1250 text to UTF-8. I just tried with the Slovak lang file (I read that there was a Czech lang file translated from the Slovak one), but some characters are not converting correctly. This also happened with Arabic originally composed in Windows 1250. I suspect that some of these languages need to be created in Unicode (UTF-8) directly, as someone did for Georgian. Some of these must be special chars which may not map into Unicode well....
You can take a look at this at i2Asia.....
Taku
|
Edited by - n/a on 03 January 2003 22:48:20 |
 |
|
MedvidekPU
Starting Member
Czech Republic
6 Posts |
Posted - 04 January 2003 : 06:58:06
|
Well, there are two issues.
First, LCID is server related, while META is client related.
For clients to handle the Windows-1250 Czech charset correctly on display and on input, the META charset is mandatory; without it, pages won't work properly in most browsers. In theory, LCID should set the proper "coding" of HTTP, but that is only theory and it is screwed: setting the LCID does not mean that there will be a correct Content-Type in the HTTP headers.
The other META, Content-Language, is needed for something else: full-text search engines, because it is the only way they can see that the content is in Czech. Microsoft Index Server, for example, will NOT index any pages missing this META. And AFAIK it is used by, for example, Google to determine the language a page is written in. And of course it is used by Czech full-text search engines (and we have at least five of them).
The Czech charset (not language <grin>) works with UTF-8, but UTF-8 is not used, as it is supported only in new browsers and on only a few servers. IIS is also very weak in supporting UTF-8, and ASP pages cannot be written in UTF-8 anyway, so there is no reason for UTF-8.
For a lot of fun: the current SNITZ FORUM (v3) I have installed sets LCID to 1033, and this currently screws up the whole IIS: ALL other ASPs on the same server (in different directories and apps) start using a strange DATE/TIME format (it swaps day and month), and all input forms using dates misbehave (for example, 2nd January entered in a form ends up as 1st February in the SQL database). Czechs always use DD/MM/YYYY ....
Czech developers on IIS never rely on the built-in date/time conversions, as they are unreliable (IIS is unable to keep the correct setting over time, and an operator logging in at the console is enough to change the format); we always write our own conversion from "char" format to "date" format (and vice versa).
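The day/month swap described in this post can be reproduced in a few lines. This is only a sketch in Python rather than ASP, and both function names are made up for illustration; the point is that the form value itself is ambiguous and only an explicit format removes the ambiguity.

```python
from datetime import datetime


def parse_day_first(text: str) -> datetime:
    """Parse DD/MM/YYYY explicitly, independent of any server locale."""
    return datetime.strptime(text, "%d/%m/%Y")


def parse_month_first(text: str) -> datetime:
    """Parse MM/DD/YYYY, which is what a US-style (LCID 1033) reading does."""
    return datetime.strptime(text, "%m/%d/%Y")


form_value = "02/01/2003"

intended = parse_day_first(form_value)    # day=2, month=1: 2nd January
swapped = parse_month_first(form_value)   # day=1, month=2: 1st February
```

Writing the conversion by hand, as described above, amounts to always calling the explicit day-first parser and never letting the server setting choose the format.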
 |
|
bjlt
Senior Member
   
1144 Posts |
Posted - 04 January 2003 : 07:54:23
|
One reason for UTF-8 is that someone may need to support multiple languages on the same page, and UTF-8 seems to be the only solution so far, e.g. for publishing a Czech-Japanese dictionary.
 |
|
MedvidekPU
Starting Member
Czech Republic
6 Posts |
Posted - 04 January 2003 : 08:38:56
|
quote: Originally posted by bjlt
One reason for UTF-8 is that someone may need to support multiple languages on the same page, and UTF-8 seems to be the only solution so far, e.g. for publishing a Czech-Japanese dictionary.
100% agree, with only one problem: ASP files CANNOT be written in UTF-8, as ASP will not be able to process them. The only place where UTF-8 can be used is in SQL queries.
 |
|
ruirib
Snitz Forums Admin
    
Portugal
26364 Posts |
|
n/a
deleted
  
593 Posts |
Posted - 04 January 2003 : 20:10:48
|
quote: Originally posted by MedvidekPU
..... For a lot of fun: the current SNITZ FORUM (v3) I have installed sets LCID to 1033, and this currently screws up the whole IIS: ALL other ASPs on the same server (in different directories and apps) start using a strange DATE/TIME format (it swaps day and month), and all input forms using dates misbehave (for example, 2nd January entered in a form ends up as 1st February in the SQL database). Czechs always use DD/MM/YYYY .... ....
I assume there are similar issues to those Japanese has: three main encoding systems/charsets (Shift-JIS, EUC, and UTF-8), predominantly Shift-JIS for the same kind of reasons you mentioned.... and they are not really compatible... though they all share LCID=1041.
About your using 1033... I am wondering whether you may be setting yourself up for basic problems by using 1033 rather than an appropriate Czech LCID. Isn't the date/time format, on the server side especially, determined by how you set the base codepage and LCID? It seems you have the same issues that many have tried to resolve by adding international date formatting to V3.3.x (you can find lots of discussions and some fixes from the past year or so). These issues have been reviewed, and I think some key design consideration has been given to them in V4b04.
Hope you find a good solution and approach to address your issue....
Taku
|
 |
|
Deleted
deleted
    
4116 Posts |
Posted - 04 January 2003 : 21:33:53
|
quote: Originally posted by MedvidekPU
First, LCID is server related, while META is client related.
Exactly, but we use the LCID value as an index into a table of language-specific data, which also delivers those meta tag values etc. So our use of it is not restricted to the server side only; in fact our primary concern is the client side.
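To make the idea concrete, here is a rough sketch of such a lookup, written in Python rather than ASP. It is not the actual Snitz table: the structure, the names, and the charset chosen for each entry are assumptions for illustration. Only the LCID numbers themselves (1029 Czech, 1033 US English, 1041 Japanese, 1051 Slovak) are standard Windows identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocaleInfo:
    language: str  # value for the Content-Language META
    charset: str   # value for the charset part of the Content-Type META


# Hypothetical table keyed by LCID; the charset choices are assumptions.
LOCALES = {
    1029: LocaleInfo("cs", "windows-1250"),
    1033: LocaleInfo("en-us", "iso-8859-1"),
    1041: LocaleInfo("ja", "shift_jis"),
    1051: LocaleInfo("sk", "windows-1250"),
}


def meta_tags(lcid: int) -> str:
    """Build both META tags for the locale stored under the given LCID."""
    info = LOCALES[lcid]
    return (
        f'<meta http-equiv="Content-Language" content="{info.language}">\n'
        f'<meta http-equiv="Content-Type" '
        f'content="text/html; charset={info.charset}">'
    )


czech_head = meta_tags(1029)
```

Used this way the LCID is just a key: the page output depends on the table entry, not on whether the server honours the LCID itself.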
quote:
For clients to handle the Windows-1250 Czech charset correctly on display and on input, the META charset is mandatory; without it, pages won't work properly in most browsers. In theory, LCID should set the proper "coding" of HTTP, but that is only theory and it is screwed: setting the LCID does not mean that there will be a correct Content-Type in the HTTP headers.
The other META, Content-Language, is needed for something else: full-text search engines, because it is the only way they can see that the content is in Czech. Microsoft Index Server, for example, will NOT index any pages missing this META. And AFAIK it is used by, for example, Google to determine the language a page is written in. And of course it is used by Czech full-text search engines (and we have at least five of them).
Thank you for this valuable info.
quote:
The Czech charset (not language <grin>) works with UTF-8, but UTF-8 is not used, as it is supported only in new browsers and on only a few servers. IIS is also very weak in supporting UTF-8, and ASP pages cannot be written in UTF-8 anyway, so there is no reason for UTF-8.
We use UTF-8 only in language files, and it has worked so far with IIS 5+ servers at least. We will not narrow our target range of servers and databases, but we know that some databases (MS SQL Server 6.5, Access '97) do not support Unicode either. So the use of UTF-8 will be a kind of "option" for those who have compatible databases and servers. We know that Unicode support is not fully mature (e.g. e-mail components also have problems with it), but we also know that it is the way to go for our multi-language support and internationalization/localization efforts.
quote:
For a lot of fun: the current SNITZ FORUM (v3) I have installed sets LCID to 1033, and this currently screws up the whole IIS: ALL other ASPs on the same server (in different directories and apps) start using a strange DATE/TIME format (it swaps day and month), and all input forms using dates misbehave (for example, 2nd January entered in a form ends up as 1st February in the SQL database). Czechs always use DD/MM/YYYY ....
Czech developers on IIS never rely on the built-in date/time conversions, as they are unreliable (IIS is unable to keep the correct setting over time, and an operator logging in at the console is enough to change the format); we always write our own conversion from "char" format to "date" format (and vice versa).
Snitz also does not rely on the server's way of handling date-time; it keeps and processes date-time info in its own special format. So will v4. Currently we can even have Japanese date formats on English-only servers. In fact it supports any date-time format on any server, as the code in effect "interprets" the date-time and formats it, although this is obviously slower than using server-based formats directly.
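As an illustration of the general approach (not the actual Snitz code), the sketch below stores every timestamp as a fixed-width string and formats it per LCID only at display time. The YYYYMMDDHHMMSS layout, the function names, and the display patterns are all assumptions made for the example; the thread does not spell out the real storage format.

```python
from datetime import datetime

# Assumed storage layout: fixed width, sorts correctly as plain text.
STORAGE_FORMAT = "%Y%m%d%H%M%S"

# Hypothetical per-LCID display patterns (1029 Czech, 1033 US, 1041 Japanese).
DISPLAY_FORMATS = {
    1029: "%d/%m/%Y %H:%M",
    1033: "%m/%d/%Y %H:%M",
    1041: "%Y/%m/%d %H:%M",
}


def to_storage(moment: datetime) -> str:
    """Serialize a datetime without consulting any server locale."""
    return moment.strftime(STORAGE_FORMAT)


def from_storage(stored: str) -> datetime:
    """Parse the stored string back into a datetime."""
    return datetime.strptime(stored, STORAGE_FORMAT)


def display(stored: str, lcid: int) -> str:
    """Format a stored timestamp for the reader's locale."""
    return from_storage(stored).strftime(DISPLAY_FORMATS[lcid])


stored = to_storage(datetime(2003, 1, 4, 21, 33, 53))
```

The cost mentioned in the post is visible here: every display goes through a parse and a format step instead of a direct server call, in exchange for output that never depends on the server's LCID.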
Stop the WAR! |
 |
|
MedvidekPU
Starting Member
Czech Republic
6 Posts |
Posted - 05 January 2003 : 11:10:09
|
quote: Originally posted by ruirib
IIS 5.0+ handles UTF-8 ASP pages without problems. At least I never had a problem using them like that. I've tested it both in IIS 5.0 and 5.1.
IIS 5.0 does NOT handle UNICODE ASP files. It's even documented in the Knowledge Base, and you can see it here:
Active Server Pages error 'ASP 0239'
Cannot process file
/rss/testunicode.asp, line 1
UNICODE ASP files are not supported.
So it might be rude of me to ask, but please do not mislead people. Thx.
Daniel
 |
|
Deleted
deleted
    
4116 Posts |
|
ruirib
Snitz Forums Admin
    
Portugal
26364 Posts |
Posted - 05 January 2003 : 18:38:24
|
quote: Originally posted by MedvidekPU
So i might be crude to ask, but please do not mislead people. Thx.
Daniel
Maybe you should be the one doing this in the first place. I have a rule, which comes from my profession: if I'm not sure about something, I say so explicitly. What I wrote before, I am sure about, because I have done it and so have other people here. I wrote it precisely because I was sure about it. I need no advice from you about misleading anyone, ok? I take the utmost care over what I write, and I'm the first to recognize my mistakes when I make them, and this is definitely not the case here. Can you say the same?
Snitz 3.4 Readme | Like the support? Support Snitz too |
 |
|