Author |
Topic  |
|
ILLHILL
Junior Member
 
Netherlands
341 Posts |
Posted - 12 July 2007 : 15:33:27
|
Hi, hope I will be able to explain this a little.
I recently ran into the following issue while parsing a specific news feed (http://rss.news.yahoo.com/rss/world). As you might see in the source of this rss, the content inside the title tags looks like this: (I replaced the spaces with underscores to illustrate this)
<title>U.S. troops raid Shiite district_ ____(AP) </title>
The problem is that they insert 5 whitespace characters before they add the source (AP in the example).
This is where it becomes a problem for me. My parser caches feeds as ASCII text files, which then results in this:

As you can see, a []-"symbol" appears in the text file at the place where the line breaks in the RSS (is that called a punctuation mark?).
This []-symbol breaks my code: links whose titles contain it stop being links.
I experimented a bit with replace(string, "[]", " ") (where I copied and pasted the actual square shape) and I also tried replace(string, " ", " "), both to no avail.
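One way to find out what the square really is (a sketch, not something from the thread): the square is usually just the editor's placeholder for a character it cannot draw, so pasting it into Replace() will not match the real character. The snippet below lists the code of every character outside printable ASCII. DescribeOddChars and strTitle are hypothetical names, and the sample title assumes the feed contains a CR/LF line break followed by spaces.

```vbscript
<%
' Hypothetical helper: reports position and code of each character
' that falls outside printable ASCII (codes 32 to 126).
Function DescribeOddChars(ByVal s)
    Dim i, code, result
    result = ""
    For i = 1 To Len(s)
        code = AscW(Mid(s, i, 1))
        ' AscW can return a negative number for high code units; normalise it
        If code < 0 Then code = code + 65536
        If code < 32 Or code > 126 Then
            result = result & "[pos " & i & ": U+" & Right("0000" & Hex(code), 4) & "] "
        End If
    Next
    DescribeOddChars = result
End Function

' Assumed sample: a title with a line break and spaces before the source
strTitle = "U.S. troops raid Shiite district" & vbCrLf & "    (AP) "
Response.Write Server.HTMLEncode(DescribeOddChars(strTitle))
%>
```

If the codes turn out to be 000D and 000A, plain Replace(string, vbCrLf, " ") would target them directly.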
I searched Google a bit, but all I can find is how to prevent the extra spaces from ending up in RSS. Since I am not creating the RSS, just parsing it, that doesn't help me.
Does anybody have any experience with this? If you need more info, let me know. If you want to see the bug in action, go to http://www.clppr.com - the upper left news cell holds the Yahoo World News feed by default, and you can see how the actual link is just turned into text because of this.
Greets & thanks, Dominic
|
CLPPR.com - All The News Only Seconds Away |
|
PPSSWeb
Junior Member
 
312 Posts |
Posted - 12 July 2007 : 15:45:12
|
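A Google search turned up a couple of uses of regular expressions that may be useful.
<%
' Replaces every match of Pattern in Str with Replacement.
Function RegExpReplace(Str, Pattern, Replacement)
    Dim objRegExp
    Set objRegExp = New RegExp
    objRegExp.Pattern = Pattern
    objRegExp.Global = True   ' replace all matches, not just the first
    RegExpReplace = objRegExp.Replace(Str, Replacement)
    Set objRegExp = Nothing
End Function

strTest = "abcd2""$$£"
' Negated character class: matches any run of characters that are NOT listed.
' Note: the unescaped ")-_" is read as a range from ")" to "_", which still
' only covers printable ASCII characters.
strPattern = "[^A-Za-z 0-9 \.,\?'""!@#\$%\^&\*\(\)-_=\+;:<>\/\\\|\}\{\[\]`~]*"
strReplace = ""   ' matched characters are deleted
response.write strTest
response.write "<br>"
response.write RegExpReplace(strTest, strPattern, strReplace)
%>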
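and a shorter, though more cryptic, version:
Set objRegExp = New RegExp
objRegExp.Global = True
objRegExp.IgnoreCase = True
' Matches any character outside the range space (\x20) to tilde (\x7E)
objRegExp.Pattern = "[^\x20-\x7E]"
EditorialReview = objRegExp.Replace(EditorialReview,"")
I haven't tried them, but they are supposed to remove everything outside printable ASCII from a string, which includes control characters such as line breaks and tabs as well as non-ASCII characters. |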
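A possible variation, sketched here and not tested against the feed in question: instead of deleting the offending characters, replace each run of them (together with any surrounding spaces) by a single space, so that words on either side of a line break are not glued together. CollapseWhitespace is a hypothetical name. Like the patterns above, it also drops legitimate non-ASCII letters, so it only suits an ASCII-only cache.

```vbscript
<%
' Hypothetical helper: collapses every run of spaces, control characters
' and non-ASCII characters into one space, then trims the ends.
Function CollapseWhitespace(ByVal s)
    Dim re
    Set re = New RegExp
    re.Global = True
    ' \x21-\x7E is visible ASCII ("!" to "~"); everything else is matched
    re.Pattern = "[^\x21-\x7E]+"
    CollapseWhitespace = Trim(re.Replace(s, " "))
    Set re = Nothing
End Function

' Assumed sample input: line break plus padding before the source
Response.Write CollapseWhitespace("U.S. troops raid Shiite district" & vbCrLf & "    (AP) ")
%>
```

If only whitespace should be touched and other characters kept, the pattern "\s+" could be used in the same function instead.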
 |
|
ILLHILL
Junior Member
 
Netherlands
341 Posts |
Posted - 12 July 2007 : 15:48:28
|
Wow...that might be it! I will try and feed back. Thanks for the help!
That was it! It's working!!!!
Thank you very much PPSSWeb!!!
Greets & thanks, Dominic
|
Edited by - ILLHILL on 12 July 2007 16:46:01 |
 |
|
PPSSWeb
Junior Member
 
312 Posts |
Posted - 13 July 2007 : 07:45:00
|
Glad it worked for you.
Just so I know, which one did you use, and did you have to change it at all? Like I said, I didn't test them, but I want to add one to my bag-of-tricks reference notes in case I ever need it.
Thanks, Steve |
 |
|
ILLHILL
Junior Member
 
Netherlands
341 Posts |
Posted - 13 July 2007 : 12:53:16
|
I only tried the first one (RegExpReplace(strTest, strPattern, strReplace)). My files for clppr contain an include with functions like Islike etc. I added the function to that include and added these two lines underneath it, so that I don't have to define them in each of the different files that need them:
strPattern = "[^A-Za-z 0-9 \.,\?'""!@#\$%\^&\*\(\)-_=\+;:<>\/\\\|\}\{\[\]`~]*"
strReplace = ""
I had already added a lot of character replacing for the string before it reaches the RegExpReplace call.
So at that point it just looked like: <a href=""link"" title=""" & RegExpReplace(string, strPattern, strReplace) & """> and it worked like a charm from there on.
As for your bag of tricks, I'd suggest having a look here: http://www.livio.net/main/asp_functions.asp
I found some very handy things in there.
Greets & thanks, Dominic
|
 |
|
AnonJr
Moderator
    
United States
5768 Posts |
Posted - 13 July 2007 : 13:24:43
|
Sweet link.  |
 |
|
PPSSWeb
Junior Member
 
312 Posts |
Posted - 13 July 2007 : 14:06:32
|
Thanks, that link will definitely come in handy.  |
 |
|
ILLHILL
Junior Member
 
Netherlands
341 Posts |
Posted - 15 July 2007 : 08:30:27
|
Glad I could at least, to some degree, give something back :) There's a ton of brilliant stuff in there with good explanations on how to use them :)
Greets & thanks, D |
 |
|