Snitz Forums 2000
 Whitespace/interpunction error/bug in parsing

ILLHILL
Junior Member

Netherlands
341 Posts

Posted - 12 July 2007 : 15:33:27
Hi, I hope I can explain this clearly.

I recently ran into the following issue while parsing a specific news
feed (http://rss.news.yahoo.com/rss/world).
As you can see in the source of this RSS feed, the content inside the title tags looks like this
(I replaced the spaces with underscores to illustrate):

<title>U.S. troops raid Shiite district_
____(AP)
</title>

The problem is that they put five spaces, split across a line break, before the source (AP in the example).

Now this is where it becomes a problem for me.
My parser caches feeds as ASCII text files,
which then results in this:


As you can see, a []-"symbol" shows up in the text file at the spot where
the line breaks in the RSS (is that called an interpunction?).

This []-symbol messes up the code.
Links whose titles contain that symbol stop being links.
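
One way to find out what the box actually is would be to print character codes rather than look at the glyph. The following is only a sketch in plain VBScript, assuming the cached title is already in a string; DescribeOddChars is a made-up helper name, not something built into ASP.

```vbscript
' Hypothetical helper: reports the position and code of every character
' that falls outside printable ASCII (codes 32 to 126).
Function DescribeOddChars(ByVal s)
    Dim i, code, result
    result = ""
    For i = 1 To Len(s)
        code = AscW(Mid(s, i, 1))
        If code < 32 Or code > 126 Then
            If Len(result) > 0 Then result = result & ", "
            result = result & "pos " & i & " = " & code
        End If
    Next
    DescribeOddChars = result
End Function
```

For a title that breaks across lines, this would typically report code 10 (line feed) and possibly 13 (carriage return), which are the values to target when replacing.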

I experimented a bit with replace(string, "[]", " ") (where I copied and pasted the actual square shape), and I also tried replace(string, " ", " "), both to no avail.
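
Pasting the box into Replace probably fails because the box is only how the editor draws a character it cannot display, so the pasted glyph is not necessarily the character that is in the file. A sketch that targets the line-break characters through VBScript's built-in constants instead, assuming the box is a carriage return, line feed, or tab (FlattenLineBreaks is a hypothetical name):

```vbscript
' Hypothetical helper: turns line breaks and tabs into spaces,
' then collapses runs of spaces into a single space.
Function FlattenLineBreaks(ByVal s)
    s = Replace(s, vbCrLf, " ")
    s = Replace(s, vbCr, " ")
    s = Replace(s, vbLf, " ")
    s = Replace(s, vbTab, " ")
    ' Keep halving double spaces until none are left.
    Do While InStr(s, "  ") > 0
        s = Replace(s, "  ", " ")
    Loop
    FlattenLineBreaks = Trim(s)
End Function
```

Called on the title from the example, FlattenLineBreaks("U.S. troops raid Shiite district " & vbLf & "    (AP)" & vbLf) gives a single-line title with one space before "(AP)".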

I searched Google a bit, but all I can find is how to prevent the
extra spaces from ending up in the RSS. Since I am not creating the
RSS, just parsing it, that doesn't help me.

Does anybody have any experience with this?
If you need more info, let me know. If you want to see the
bug in action, go to http://www.clppr.com: the upper left news cell holds
the Yahoo World News feed by default, and you can see how the actual link is just
turned into text because of this.

Greets & thanks, Dominic



CLPPR.com - All The News Only Seconds Away

PPSSWeb
Junior Member

312 Posts

Posted - 12 July 2007 : 15:45:12
A Google search turned up a couple of regular expression snippets that may be useful.

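<%
	' Replaces every match of Pattern in Str with Replacement.
	Function RegExpReplace(Str, Pattern, Replacement)
		Dim objRegExp
		Set objRegExp = New RegExp
		objRegExp.Pattern = Pattern
		objRegExp.Global = True	' replace all matches, not only the first
		RegExpReplace = objRegExp.Replace(Str, Replacement)
		Set objRegExp = Nothing
	End Function
	
	Dim strTest, strPattern, strReplace
	strTest = "abcd2""$$£"
	' Whitelist: anything NOT listed inside [^ ... ] is removed.
	' Note: \)-_ is read as a range from ")" to "_"; it still covers "-" and "_".
	strPattern = "[^A-Za-z 0-9 \.,\?'""!@#\$%\^&\*\(\)-_=\+;:<>\/\\\|\}\{\[\]`~]*"
	strReplace = ""
	
	Response.Write strTest
	Response.Write "<br>"
	Response.Write RegExpReplace(strTest, strPattern, strReplace)
%>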


and a shorter, if more cryptic, one:

Set objRegExp = New RegExp
objRegExp.Global = True
objRegExp.IgnoreCase = True
objRegExp.Pattern = "[^\x20-\x7E]"
EditorialReview = objRegExp.Replace(EditorialReview,"")


I haven't tried them, but they are supposed to remove non-ASCII characters from a string.
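
Both patterns delete the offending characters outright. If a line break happens to be the only thing separating two words, deleting it would glue those words together. A possible variation, assuming the same VBScript RegExp object as in the snippets above (CleanTitle is a hypothetical name), is to collapse every run of whitespace to one space first and only then strip what remains outside printable ASCII:

```vbscript
' Hypothetical helper: normalises whitespace, then removes every
' character outside the printable ASCII range \x20-\x7E.
Function CleanTitle(ByVal s)
    Dim re
    Set re = New RegExp
    re.Global = True

    ' Step 1: any run of spaces, tabs, or line breaks becomes one space.
    re.Pattern = "\s+"
    s = re.Replace(s, " ")

    ' Step 2: drop whatever is still not printable ASCII.
    re.Pattern = "[^\x20-\x7E]"
    s = re.Replace(s, "")

    Set re = Nothing
    CleanTitle = Trim(s)
End Function
```

Like the second pattern above, this also strips accented letters and symbols such as £, so it only suits feeds where plain ASCII titles are acceptable.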

ILLHILL
Junior Member

Netherlands
341 Posts

Posted - 12 July 2007 : 15:48:28
Wow...that might be it!
I will try it and report back.
Thanks for the help!


That was it! It's working!!!!

Thank you very much PPSSWeb!!!

Greets & thanks, Dominic

Edited by - ILLHILL on 12 July 2007 16:46:01

PPSSWeb
Junior Member

312 Posts

Posted - 13 July 2007 : 07:45:00
Glad it worked for you.

Just so I know, which one did you use, and did you have to change it at all? Like I said, I didn't test them, but I want to add one to my bag of tricks in case I ever need it.

Thanks,
Steve

ILLHILL
Junior Member

Netherlands
341 Posts

Posted - 13 July 2007 : 12:53:16
I only tried the first one (RegExpReplace(strTest, strPattern, strReplace)).
My files for clppr contain an include with functions like Islike etc.
I added the function to that include and put these two lines underneath it
(so that I don't have to set them in every file that needs them):

strPattern = "[^A-Za-z 0-9 \.,\?'""!@#\$%\^&\*\(\)-_=\+;:<>\/\\\|\}\{\[\]`~]*"
strReplace = ""

I already do a lot of character replacing on the string before it reaches
the RegExpReplace call.

So at that point it just looked like:
<a href=""link"" title=""" & RegExpReplace(string, strPattern, strReplace) & """> and it worked
like a charm from there on.
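
Since the cleaned title ends up inside an HTML attribute, a stray quote or ampersand in a headline could break the link too. A minimal sketch of escaping the values before building the tag follows; AttrEncode and BuildLink are hypothetical helpers, and inside classic ASP the built-in Server.HTMLEncode could be used instead of AttrEncode.

```vbscript
' Hypothetical helper: escapes the characters that are unsafe
' inside a double-quoted HTML attribute or in element text.
Function AttrEncode(ByVal s)
    s = Replace(s, "&", "&amp;")   ' must run first so later entities stay intact
    s = Replace(s, """", "&quot;")
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")
    AttrEncode = s
End Function

' Hypothetical helper: builds the anchor from an already cleaned title.
Function BuildLink(ByVal url, ByVal title)
    BuildLink = "<a href=""" & AttrEncode(url) & """ title=""" & _
                AttrEncode(title) & """>" & AttrEncode(title) & "</a>"
End Function
```

The title passed to BuildLink would be the output of RegExpReplace (or any other cleaning step), so the control characters are already gone by the time the quotes are escaped.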

As for your bag of tricks, I suggest having a look here: http://www.livio.net/main/asp_functions.asp

I found some very handy things in there.


Greets & thanks, Dominic



AnonJr
Moderator

United States
5768 Posts

Posted - 13 July 2007 : 13:24:43
Sweet link.

PPSSWeb
Junior Member

312 Posts

Posted - 13 July 2007 : 14:06:32
Thanks, that link will definitely come in handy.

ILLHILL
Junior Member

Netherlands
341 Posts

Posted - 15 July 2007 : 08:30:27
Glad I could, at least to some degree, give something back :)
There's a ton of brilliant stuff in there, with good explanations of how to use it :)

Greets & thanks, D
