hnadude
Starting Member
8 Posts |
Posted - 21 August 2003 : 18:06:07
|
I've been using the Microsoft.XMLHTTP object in ASP to grab HTML from a Spanish-language site, and it works perfectly well on all the pages but one. (I am specifically referring to the foreign language characters: PANAMÁ / Telefónica... etc.)
One page returns "?"s where there should be Foreign Language Characters like "Á".
I have a feeling it might be the way the site coded those chars on that particular page. I'll try to explain my thoughts, but it's only a guess.
When I look at the raw HTML via View Source in IE, the pages that Microsoft.XMLHTTP renders PROPERLY have the chars coded like so: "PANAMÁ"
When I look at the offending page's raw HTML via View Source in IE, it looks like this: "PANAMÁ".
Then, when I dump the Microsoft.XMLHTTP HTML output of the proper pages, I get "PANAMÁ". This is perfect!
When I dump the Microsoft.XMLHTTP HTML output of the bad page, I get "PANAM?". This is not :( Where and why the heck is the "?" coming from?
Is that a clue?
Also, I have tried all of these, but I'm just shooting in the dark:
oHTTP.setRequestHeader "Content-Type", "text/html"
oHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=EUC-KR"
oHTTP.setRequestHeader "Content-Type", "text/html; charset=windows-1252"
oHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
oHTTP.setRequestHeader "Content-Type", "text/html; charset=iso-8859-1"
I have no solution to this. I'm stuck. Help, anybody?
|
|
hnadude
Starting Member
8 Posts |
Posted - 21 August 2003 : 18:19:33
|
Jeez, I just looked at my question and the forum converted my chars... now that's gonna confuse the heck out of everybody. (Let's see if this is better.)
Reposting this ....
I've been using the Microsoft.XMLHTTP object in ASP to grab HTML from a Spanish-language site, and it works perfectly well on all the pages but one. (I am specifically referring to the foreign language characters: PANAMÁ / Telefónica... etc.)
One page returns "?"s where there should be Foreign Language Characters like "Á".
I have a feeling it might be the way the site coded those chars on that particular page. I'll try to explain my thoughts, but it's only a guess.
When I look at the raw HTML via View Source in IE, the pages that Microsoft.XMLHTTP renders PROPERLY have the chars coded as an HTML entity, like so: "PANAM&Aacute;"
When I look at the offending page's raw HTML via View Source in IE, it has the literal character instead: "PANAMÁ".
Then, when I dump the Microsoft.XMLHTTP HTML output of the proper pages, I get "PANAM&Aacute;". This is perfect!
When I dump the Microsoft.XMLHTTP HTML output of the bad page, I get "PANAM?". This is not :( Where and why the heck is the "?" coming from?
Is that a clue?
Also, I have tried all of these, but I'm just shooting in the dark:
oHTTP.setRequestHeader "Content-Type", "text/html"
oHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=EUC-KR"
oHTTP.setRequestHeader "Content-Type", "text/html; charset=windows-1252"
oHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
oHTTP.setRequestHeader "Content-Type", "text/html; charset=iso-8859-1"
I have no solution to this. I'm stuck. Help, anybody?
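A possible explanation for the symptom described above, sketched in Python rather than ASP so it can be run standalone. The assumption (not confirmed anywhere in this thread) is that the bad page sends the raw single-byte ISO-8859-1 character without declaring a charset, and that the text is then decoded as UTF-8, which is what responseText is documented to assume by default. The byte strings and function names below are hypothetical and only illustrate the mechanism.

```python
# Hypothetical page fragments (assumed, not taken from the real site).
# The "good" pages use an HTML entity, which is plain ASCII on the wire.
entity_page = b"PANAM&Aacute;"
# The "bad" page is assumed to send the raw ISO-8859-1 byte 0xC1 for A-acute.
raw_page = "PANAM\u00c1".encode("iso-8859-1")


def decode_as_utf8(body: bytes) -> str:
    """Decode bytes as UTF-8, showing undecodable bytes as '?'."""
    text = body.decode("utf-8", errors="replace")
    # Python marks bad bytes with U+FFFD; show them as '?' like the dump.
    return text.replace("\ufffd", "?")


def decode_with_charset(body: bytes, charset: str) -> str:
    """Decode bytes using an explicitly chosen charset."""
    return body.decode(charset)


print(decode_as_utf8(entity_page))                  # entity survives untouched
print(decode_as_utf8(raw_page))                     # 0xC1 is not valid UTF-8
print(decode_with_charset(raw_page, "iso-8859-1"))  # decodes to A-acute
```

If that assumption holds, the setRequestHeader calls would not help, because a Content-Type header on the request describes the request body, not how the response is decoded. One direction worth trying in ASP is to read the raw bytes from responseBody and decode them explicitly, for example with an ADODB.Stream whose Charset is set to "iso-8859-1".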