Snitz Forums 2000
 Something around LCID

MedvidekPU
Starting Member

Czech Republic
6 Posts

Posted - 03 January 2003 : 04:26:59
Bozden, I have read some of your notes about using LCID and I have to warn you: it is a useless thing (especially for Czech-language websites). Microsoft screwed it up completely, and on some IIS versions it does not work at all.

The only real solution, which can be found on any .cz website running IIS, is adding the appropriate META tag with a charset definition.

It's even worse: any Czech page running on IIS should have one more META tag added.

<meta http-equiv="Content-Language" content="cs" >
<meta http-equiv="Content-Type" content="text/html; charset=windows-1250" >

Those two are an absolute MUST.

And if I may suggest, it would be necessary to add support for this to v4.

Daniel

Deleted
deleted

4116 Posts

Posted - 03 January 2003 : 06:21:18
We use the LCID to differentiate between locales; server support for LCIDs is not a must. Although LCID seems to be MS-only, Snitz also supports non-IIS platforms. On the other hand, having server support makes date-time format conversion quicker, so we are also investigating ways to use it when it is available.

For our purposes, a country and/or language code is not enough, because it does not give complete control over the possibilities. So we use LCID coding because it is the most advanced.

"Content-Type" is already part of the locale, and your advice about "Content-Language" has been noted.

Two questions, though:

1) Can you tell me what happens when "Content-Type" is specified but "Content-Language" is not specified?
2) How does the Czech language behave under UTF-8?

Stop the WAR!

n/a
deleted

593 Posts

Posted - 03 January 2003 : 15:44:52
A little side comment on this topic... I think there is some inherent problem in converting Windows-1250 text to UTF-8. I just tried with the Slovak lang file (as I read that the Czech lang file was translated from Slovak), but some characters are not converting correctly. This also happened with Arabic originally composed in Windows 1250. I suspect that some of these languages need to be created in Unicode (UTF-8) directly, as someone did for Georgian. Some of these must be special chars which may not map into Unicode well....

You can take a look at this at i2Asia.....

Taku

Tmpj
Junior Member

Denmark
467 Posts

Posted - 03 January 2003 : 16:16:45
I have tried Snitz on many Linux servers with either iASP or Chili!ASP and it worked fine!
But if I couldn't get anything other than Linux for a website, maybe I would have used phpBB or Invision Board.

n/a
deleted

593 Posts

Posted - 03 January 2003 : 17:32:31
There is also a forum running on Linux with Chili!ASP in Shift-JIS Japanese, with all email components now working with Japanese etc., including file object handling ...... check:
http://www.airwork.jp/forum/default.asp

Taku

n/a
deleted

593 Posts

Posted - 03 January 2003 : 22:14:12
Well, using Notepad on XP, I could save/convert the Slovak lang file (LCID 1051, Windows-1250) to UTF-8, and there seems to be no character corruption.... (no more boxes or numeric codes showing up for special chars)... so I assume that, like the other exotic langs deployed in UTF-8, the Czech lang file will probably render fine in UTF-8....
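
The conversion described here can be reproduced outside Notepad. The following is a minimal Python sketch, not the procedure used in the thread: the sample sentence and variable names are illustrative assumptions. It suggests that corruption is more likely to come from decoding the bytes with the wrong source charset than from any gap in Unicode itself.

```python
# Illustrative sample: a Czech pangram with the accented letters used in Czech.
sample = "Příliš žluťoučký kůň úpěl ďábelské ódy"

# Bytes as an editor working in Windows-1250 would have saved them.
cp1250_bytes = sample.encode("cp1250")

# Correct conversion: decode with the real source charset, then encode as UTF-8.
utf8_bytes = cp1250_bytes.decode("cp1250").encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Typical mistake: reading the same bytes as Windows-1252 (Western European).
# Bytes that cp1252 does not define become the replacement character.
mangled = cp1250_bytes.decode("cp1252", errors="replace")

print("round trip intact:", round_trip == sample)
print("decoded with wrong charset:", mangled)
```

If the round trip is intact but a page still shows boxes, the declared charset of the page is worth checking before blaming the conversion.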

quote:
Originally posted by LeoRat

A little side comment on this topic... I think there is some inherent problem in converting Windows-1250 text to UTF-8. I just tried with the Slovak lang file (as I read that the Czech lang file was translated from Slovak), but some characters are not converting correctly. This also happened with Arabic originally composed in Windows 1250. I suspect that some of these languages need to be created in Unicode (UTF-8) directly, as someone did for Georgian. Some of these must be special chars which may not map into Unicode well....

You can take a look at this at i2Asia.....

Taku

Edited by - n/a on 03 January 2003 22:48:20

MedvidekPU
Starting Member

Czech Republic
6 Posts

Posted - 04 January 2003 : 06:58:06
Well, there are two issues.

First, LCID is server-related, while META is client-related.

For clients to handle the Windows-1250 Czech charset correctly on display and on input, the META charset is mandatory; without it, things won't work properly in most browsers. In theory, LCID should set the proper "coding" of HTTP, but that is only theory and it is screwed up: setting the LCID does not mean that there will be a correct Content-Type in the HTTP headers.

The other META, Content-Language, is needed for something else: full-text search engines, because it is the only way they can see that the content is in Czech. Microsoft Index Server, for example, will NOT index any pages missing this META. And AFAIK it is used by, for example, Google to detect the language a page is written in. And of course it is used by Czech full-text search engines (and we have at least five of them).

The Czech charset (not language <grin>) works with UTF-8, but UTF-8 is not used because it is supported only in new browsers and only on a few servers. IIS is also very weak in supporting UTF-8, and ASP pages cannot be written in UTF-8 anyway, so there is no reason for UTF-8.

For a lot of fun: the current Snitz forum (v3) I have installed sets the LCID to 1033, and that currently screws up the whole IIS. ALL the other ASPs on the same server (in different directories and apps) start using some strange DATE/TIME format (it swaps day and month), and all input forms using dates misbehave (for example, 2nd January entered in a form results in 1st February being written to the SQL database). Czechs always use DD/MM/RRRR (that is, DD/MM/YYYY) ....
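
The day/month swap can be shown in a few lines. This is a small Python sketch rather than ASP code; the variable names are illustrative, and the only assumption is that LCID 1033 (English, United States) implies month-first parsing.

```python
from datetime import datetime

# What a Czech user types for 2 January 2003 (day first).
form_input = "02/01/2003"

# Parsed with the convention the user intended.
as_czech = datetime.strptime(form_input, "%d/%m/%Y")

# Parsed month-first, as a US-English locale (LCID 1033) would read it.
as_us = datetime.strptime(form_input, "%m/%d/%Y")

print("intended:", as_czech.date().isoformat())
print("stored under a US locale:", as_us.date().isoformat())
```

Both parses succeed without any error, which is why the problem tends to surface only later, in the stored data.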

On IIS, Czech developers never rely on the built-in date/time conversions, as they are unreliable (IIS is unable to keep the correct setting over time, and an operator logging in at the console is enough to change the format); we always write our own conversion from "char" format to "date" format (and vice versa).
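
A hand-written conversion of this kind might look like the sketch below. It is written in Python for illustration, not taken from any Czech site; the function names `parse_cz_date` and `format_cz_date` are hypothetical. The point is that the field order is fixed in code, so no server locale setting can change it.

```python
import re
from datetime import date

# Day, month, four-digit year, separated by "/" or "." (both are assumed
# to be acceptable input here), with optional spaces around the separators.
_CZ_DATE = re.compile(r"^\s*(\d{1,2})\s*[./]\s*(\d{1,2})\s*[./]\s*(\d{4})\s*$")


def parse_cz_date(text: str) -> date:
    """Convert a day-first "char" date into a date value, ignoring any locale."""
    match = _CZ_DATE.match(text)
    if match is None:
        raise ValueError(f"not a DD/MM/YYYY date: {text!r}")
    day, month, year = (int(group) for group in match.groups())
    # date() itself rejects impossible dates such as 31/02/2003.
    return date(year, month, day)


def format_cz_date(value: date) -> str:
    """Convert a date value back into the fixed DD/MM/YYYY "char" form."""
    return f"{value.day:02d}/{value.month:02d}/{value.year:04d}"


print(parse_cz_date("02/01/2003").isoformat())
print(format_cz_date(date(2003, 1, 2)))
```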

bjlt
Senior Member

1144 Posts

Posted - 04 January 2003 : 07:54:23
One reason for UTF-8 is that someone may need to support multiple languages on the same page, and UTF-8 seems to be the only solution so far, e.g. for publishing a Czech-Japanese dictionary.

MedvidekPU
Starting Member

Czech Republic
6 Posts

Posted - 04 January 2003 : 08:38:56
quote:
Originally posted by bjlt

One reason for UTF-8 is that someone may need to support multiple languages on the same page, and UTF-8 seems to be the only solution so far, e.g. for publishing a Czech-Japanese dictionary.



I agree 100%, with only one problem: ASP files CANNOT be written in UTF-8, as ASP will not be able to process them. The only place where UTF-8 can be used is in SQL queries.

ruirib
Snitz Forums Admin

Portugal
26364 Posts

Posted - 04 January 2003 : 09:15:18
IIS 5.0+ handles UTF-8 ASP pages without problems. At least I never had a problem using them like that. I've tested it both in IIS 5.0 and 5.1.



n/a
deleted

593 Posts

Posted - 04 January 2003 : 20:10:48
quote:
Originally posted by MedvidekPU

.....
For a lot of fun: the current Snitz forum (v3) I have installed sets the LCID to 1033, and that currently screws up the whole IIS. ALL the other ASPs on the same server (in different directories and apps) start using some strange DATE/TIME format (it swaps day and month), and all input forms using dates misbehave (for example, 2nd January entered in a form results in 1st February being written to the SQL database). Czechs always use DD/MM/RRRR (that is, DD/MM/YYYY) ....
....




I assume these are similar to the issues Japanese has: three main encoding systems/charsets (Shift-JIS, EUC, and UTF-8), with Shift-JIS predominant for the same kind of reasons you mentioned.... They are not really compatible with each other... yet they all share LCID=1041.
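
To make the incompatibility concrete, here is a short Python sketch that encodes one Japanese word three ways. The sample word and variable names are illustrative assumptions. One locale identifier says nothing about which of these byte sequences a page actually contains, so the charset has to be declared separately.

```python
# The word "Japanese language", used purely as sample text.
text = "日本語"

encodings = ["shift_jis", "euc_jp", "utf-8"]
encoded = {name: text.encode(name) for name in encodings}

# The same three characters give three different byte sequences.
for name, raw in encoded.items():
    print(f"{name:10s} {raw.hex()}")

# Each encoding round-trips with itself...
round_trips = {name: raw.decode(name) == text for name, raw in encoded.items()}

# ...but reading Shift-JIS bytes as EUC-JP does not recover the text.
misread = encoded["shift_jis"].decode("euc_jp", errors="replace")

print("all round trips intact:", all(round_trips.values()))
print("Shift-JIS read as EUC-JP matches:", misread == text)
```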

About your using 1033... I am wondering whether you may be setting yourself up for basic problems by using 1033 rather than an appropriate Czech LCID? Isn't the date/time format, especially on the server side, determined by how you set the basic codepage and LCID? It seems you have the same issues that many tried to resolve by adding international date formatting to V3.3.x (you can find lots of discussions and some fixes from the past year or so). These issues have been reviewed, and some key design consideration has been given to them in V4b04, I think.

Hope you find a good solution and approach to address your issue....

Taku

Deleted
deleted

4116 Posts

Posted - 04 January 2003 : 21:33:53
quote:
Originally posted by MedvidekPU


First, LCID is server-related, while META is client-related.


Exactly, but we use the LCID value as an index into a table of language-specific data, which also delivers those meta tag values etc. So our use of it is not restricted to the server side; in fact, our primary concern is the client side.
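
The idea of an LCID-keyed table can be sketched as follows. This is an illustration in Python, not the Snitz v4 data structure: the table layout, the field names, the fallback rule, and the `meta_tags` function are all assumptions. The LCID numbers themselves are the standard Microsoft values for these locales.

```python
from html import escape

# Hypothetical locale table keyed by LCID; the fields are illustrative only.
LOCALES = {
    1029: {"language": "cs", "charset": "windows-1250"},     # Czech
    1033: {"language": "en-us", "charset": "windows-1252"},  # English (US)
    1041: {"language": "ja", "charset": "shift_jis"},        # Japanese
    1051: {"language": "sk", "charset": "windows-1250"},     # Slovak
}
DEFAULT_LCID = 1033  # assumed fallback for unknown LCIDs


def meta_tags(lcid: int) -> str:
    """Return the two META tags for a locale, looked up by LCID."""
    info = LOCALES.get(lcid, LOCALES[DEFAULT_LCID])
    language = escape(info["language"])
    charset = escape(info["charset"])
    return "\n".join([
        f'<meta http-equiv="Content-Language" content="{language}">',
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">',
    ])


print(meta_tags(1029))
```

Used this way, the LCID is only a lookup key; nothing in the sketch depends on the web server understanding LCIDs at all.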

quote:

For clients to handle the Windows-1250 Czech charset correctly on display and on input, the META charset is mandatory; without it, things won't work properly in most browsers. In theory, LCID should set the proper "coding" of HTTP, but that is only theory and it is screwed up: setting the LCID does not mean that there will be a correct Content-Type in the HTTP headers.

The other META, Content-Language, is needed for something else: full-text search engines, because it is the only way they can see that the content is in Czech. Microsoft Index Server, for example, will NOT index any pages missing this META. And AFAIK it is used by, for example, Google to detect the language a page is written in. And of course it is used by Czech full-text search engines (and we have at least five of them).


Thank you for this valuable info.

quote:

The Czech charset (not language <grin>) works with UTF-8, but UTF-8 is not used because it is supported only in new browsers and only on a few servers. IIS is also very weak in supporting UTF-8, and ASP pages cannot be written in UTF-8 anyway, so there is no reason for UTF-8.


We use UTF-8 only in the language files, and so far it has worked, at least with IIS 5+ servers. We will not narrow our range of target servers and databases, but we know that some databases (MS SQL Server 6.5, Access '97) do not support Unicode. So UTF-8 will be a kind of "optional" feature for those who have compatible databases and servers. We know that Unicode support is not fully mature (e.g. e-mail components also have problems with it), but we also know that it is the way to go for our multi-language support and internationalization/localization efforts.

quote:

For a lot of fun: the current Snitz forum (v3) I have installed sets the LCID to 1033, and that currently screws up the whole IIS. ALL the other ASPs on the same server (in different directories and apps) start using some strange DATE/TIME format (it swaps day and month), and all input forms using dates misbehave (for example, 2nd January entered in a form results in 1st February being written to the SQL database). Czechs always use DD/MM/RRRR (that is, DD/MM/YYYY) ....

On IIS, Czech developers never rely on the built-in date/time conversions, as they are unreliable (IIS is unable to keep the correct setting over time, and an operator logging in at the console is enough to change the format); we always write our own conversion from "char" format to "date" format (and vice versa).


Snitz also does not rely on the server's way of handling date-time; it keeps and processes date-time info in its own special format. So will v4. Currently we can even have Japanese date formats on English-only servers. In fact, it supports any date-time format on any server, as the code itself "interprets" the date-time and formats it, although this is obviously slower than using server-based formats directly.
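
The approach of storing dates in a neutral form and formatting them in code could look roughly like this. The thread does not describe the actual stored format, so the 14-digit `YYYYMMDDHHMMSS` string below is an assumption, as are the display patterns and the `format_for_lcid` function; this is a Python illustration, not Snitz code.

```python
from datetime import datetime

# Assumed neutral storage format: a fixed-width, locale-free string.
STORED_FORMAT = "%Y%m%d%H%M%S"

# Hypothetical display patterns per LCID, filled in by plain string formatting
# so that no operating-system locale is involved.
DISPLAY_PATTERNS = {
    1029: "{d:02d}.{m:02d}.{Y} {H:02d}:{M:02d}:{S:02d}",  # Czech
    1033: "{m:02d}/{d:02d}/{Y} {H:02d}:{M:02d}:{S:02d}",  # English (US)
    1041: "{Y}年{m}月{d}日 {H:02d}:{M:02d}:{S:02d}",        # Japanese
}


def format_for_lcid(stored: str, lcid: int) -> str:
    """Interpret a stored date string and format it for the given locale."""
    moment = datetime.strptime(stored, STORED_FORMAT)
    pattern = DISPLAY_PATTERNS.get(lcid, DISPLAY_PATTERNS[1033])
    return pattern.format(
        Y=moment.year, m=moment.month, d=moment.day,
        H=moment.hour, M=moment.minute, S=moment.second,
    )


stored = "20030102042659"  # 2 January 2003, 04:26:59
for lcid in (1029, 1033, 1041):
    print(lcid, format_for_lcid(stored, lcid))
```

Because the stored string sorts chronologically and never passes through a locale-dependent parser, the day/month swap described earlier in the thread cannot occur at the storage layer.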

Stop the WAR!

MedvidekPU
Starting Member

Czech Republic
6 Posts

Posted - 05 January 2003 : 11:10:09
quote:
Originally posted by ruirib

IIS 5.0+ handles UTF-8 ASP pages without problems. At least I never had a problem using them like that. I've tested it both in IIS 5.0 and 5.1.



IIS 5.0 does NOT handle UNICODE ASP files. It's even documented in the Knowledge Base, and you can see it here:

Active Server Pages error 'ASP 0239'
Cannot process file
/rss/testunicode.asp, line 1
UNICODE ASP files are not supported.

So it might be rude of me to ask, but please do not mislead people. Thx.

Daniel

Deleted
deleted

4116 Posts

Posted - 05 January 2003 : 15:18:49
I'm running it on IIS 5 (W2K Advanced Server), and it works OK with UTF-8. And yes, FSO works only with normal Unicode (not UTF-8).

The KB article "Unicode Code Page Not Supported in Internet Information Server" talks about UNICODE, not UTF-8.
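
The distinction matters because the two are different on disk. The Python sketch below assumes, as this reply reads the KB article, that "UNICODE" there means a UTF-16 file such as Notepad writes when saving as "Unicode"; the sample word and the `looks_like_utf16` helper are illustrative.

```python
import codecs

text = "Čeština"  # sample text, 7 characters

# What Notepad's "Unicode" option writes: a byte order mark plus UTF-16 LE,
# two bytes per character here.
utf16_file = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# UTF-8: ASCII letters stay one byte, accented letters take two.
utf8_file = text.encode("utf-8")


def looks_like_utf16(raw: bytes) -> bool:
    """Rough check: does the file start with a UTF-16 byte order mark?"""
    return raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


print("UTF-16 file:", len(utf16_file), "bytes, UTF-16 BOM:", looks_like_utf16(utf16_file))
print("UTF-8 file: ", len(utf8_file), "bytes, UTF-16 BOM:", looks_like_utf16(utf8_file))
```

A parser that expects single-byte ASCII for its script delimiters can cope with the UTF-8 bytes, while the UTF-16 file interleaves zero bytes with every ASCII character.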

Some loosely related UTF-8 problems were solved in SP3, as explained here: http://support.microsoft.com/default.aspx?scid=/support/servicepacks/windows/2000/sp3fixlist.asp

It would be nice if you provided some evidence from the KB articles, so as not to mislead people.

Stop the WAR!

ruirib
Snitz Forums Admin

Portugal
26364 Posts

Posted - 05 January 2003 : 18:38:24
quote:
Originally posted by MedvidekPU



So it might be rude of me to ask, but please do not mislead people. Thx.

Daniel


Maybe you should be the one doing this in the first place.

I have a rule, which comes from my profession: if I'm not sure about something, I say so explicitly. I am sure about what I wrote before, because I have done it, and so have other people here. I wrote it precisely because I was sure about it. I need no advice from you about misleading anyone, OK? I take the utmost care about what I write, and I'm the first to recognize my mistakes when I make them, and this is definitely not such a case. Can you say the same?

