Snitz Forums 2000
 Lisp algorithm to vb asp
Podge
Support Moderator

Ireland
3776 Posts

Posted - 27 August 2007 :  19:04:58
I'm trying to convert a Lisp algorithm to VB. This is the Lisp algorithm:
(let ((g (* 2 (or (gethash word good) 0)))
      (b (or (gethash word bad) 0)))
   (unless (< (+ g b) 5)
     (max .01
          (min .99 (float (/ (min 1 (/ b nbad))
                             (+ (min 1 (/ g ngood))   
                                (min 1 (/ b nbad)))))))))
Division in Lisp uses prefix notation, (/ dividend divisor), e.g. (/ 10 5) = 2.
(float integer) just converts an integer to a float.
The bit in red above should equate to the bit in red below. I think this is where the problem is.
VB doesn't have Min and Max functions, so I have included my own below. I realise they could be prettier.

Function calculateWordProbability(word)
	nGood = 4000 ' number of good topics and replies
	nBad = 4000 ' number of bad topics and replies
	g = 10 ' good word frequency to be pulled from db
	b = 10 ' bad word frequency to be pulled from db
	g = g * 2 ' good word bias

	' only consider words with a combined frequency of at least 5,
	' matching (unless (< (+ g b) 5) ...) in the Lisp
	If (g + b) >= 5 Then
		calculateWordProbability = Max(.01,(Min(.99, (Min(1,b/nbad))))) / (Min(1,g/nGood) + Min(1,b/nBad))
	Else
		calculateWordProbability = 0.4 ' word is not popular in the db so assign default value of .4
	End If
End Function

Function Min(a, b)
	if a > b then 
		Min = b
	else 
		Min = a
	end if
end function

Function Max(a, b)
	if a > b then 
		Max = a
	else 
		Max = b
	end if
end function
I can't test the Lisp algorithm as I don't have a Lisp interpreter or compiler, but I'm sure the values returned should be between 0.01 and 0.99. The hardcoded values above return 1.33333333333333.

Any Lisp experts out there? Can anyone see where I'm going wrong?

Podge.

The Hunger Site - Click to donate free food | My Blog | Snitz 3.4.05 AutoInstall (Beta!)

My Mods: CAPTCHA Mod | GateKeeper Mod
Tutorial: Enable subscriptions on your board

Warning: The post above or below may contain nuts.

Edited by - Podge on 27 August 2007 19:05:38

HuwR
Forum Admin

United Kingdom
20595 Posts

Posted - 28 August 2007 :  04:00:04
I think your brackets are wrong. Shouldn't it be more like this, with Min(.99, ...) wrapping the whole division and Max(.01, ...) wrapping that?

Max(.01, Min(.99, Min(1,b/nBad) / (Min(1,g/nGood) + Min(1,b/nBad))))
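For comparison, here is a rough Python sketch of how the Lisp expression nests: the ratio is computed first, then clamped to the range .01 to .99 as a whole. The names word_prob and clamp are made up for illustration, and the 0.4 default for rare words is taken from the VB code in this thread, not from the Lisp snippet (which simply returns nothing in that case).

```python
def clamp(x, lo, hi):
    """Equivalent of (max lo (min hi x)) in the Lisp."""
    return max(lo, min(hi, x))


def word_prob(good_count, bad_count, n_good, n_bad, default=0.4):
    g = 2 * good_count  # good word bias, as in (* 2 ...)
    b = bad_count
    if g + b < 5:
        # Lisp: (unless (< (+ g b) 5) ...) skips rare words.
        # The 0.4 default is an assumption carried over from the VB code.
        return default
    bad_ratio = min(1.0, b / n_bad)
    good_ratio = min(1.0, g / n_good)
    # Clamp the whole ratio, not just the numerator.
    return clamp(bad_ratio / (good_ratio + bad_ratio), 0.01, 0.99)


# Same hardcoded values as the VB function: prints roughly 0.333
print(word_prob(10, 10, 4000, 4000))
```

The order of the clamp matters for edge cases. For a word seen only in bad posts the raw ratio is 1, and only a clamp around the whole division brings it back to .99.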


Podge
Support Moderator

Ireland
3776 Posts

Posted - 28 August 2007 :  08:34:57
Good man HuwR. Looks like it worked.

I don't suppose you know how to combine probabilities?

For example:

.01 probability means it's a good word
.50 probability means it's a neutral word
.99 probability means it's a bad word

The three words in a sentence have the following probabilities:

.90
.97
.98

Then you can be almost 100% sure (.99 probability) that the sentence is a bad sentence.

How is this calculated mathematically?

Podge.


HuwR
Forum Admin

United Kingdom
20595 Posts

Posted - 28 August 2007 :  09:54:03
Sorry, no. Couldn't you just keep a running count of bad words, and if more than 3 of them are over 0.9, flag the sentence as bad?

Podge
Support Moderator

Ireland
3776 Posts

Posted - 28 August 2007 :  10:08:05
What happens, though, if you have 15 words, all with different probabilities ranging from .01 to .99?

Podge.


HuwR
Forum Admin

United Kingdom
20595 Posts

Posted - 28 August 2007 :  10:35:22
Surely you would just ignore the low probability ones, since it is only the high probability words you are interested in?

Podge
Support Moderator

Ireland
3776 Posts

Posted - 28 August 2007 :  11:00:26
I cannot discard the low (good) probability ones, as I need to know whether it's a good or a bad sentence. If you discard the low ones you're automatically biasing all sentences towards being bad. That's not good, considering the function above includes two lines specifically to increase good word bias.

Podge.


Shaggy
Support Moderator

Ireland
6780 Posts

Posted - 28 August 2007 :  11:04:30
Why not take the average of all words in the sentence and apply the same criteria to that average (0.01=good sentence, etc.)?


Search is your friend
“I was having a mildly paranoid day, mostly due to the
fact that the mad priest lady from over the river had
taken to nailing weasels to my front door again.”

Podge
Support Moderator

Ireland
3776 Posts

Posted - 28 August 2007 :  11:21:54
Only because it's not correct. Word pairs and sequences may be assigned probabilities in the future.

Like the example above:

Two words in a two word sentence have the following probabilities:

.90
.90

Then you can be almost 100% sure (.99 probability) that the sentence is a bad sentence. The average is .90 but the correct probability is .99. That's a 10% discrepancy.

I don't have a lot of room for errors to creep in, as there may be other factors which will skew the results. If I can find a mathematical way of combining probabilities I'll post it here and work on a function from there.

Podge.


Edited by - Podge on 28 August 2007 11:23:30

Podge
Support Moderator

Ireland
3776 Posts

Posted - 28 August 2007 :  11:50:59
Thanks for the help and suggestions, guys. I found out how:

To combine two probabilities
ab
-------------------
ab + (1 - a)(1 - b)

To combine three
abc           
---------------------------
abc + (1 - a)(1 - b)(1 - c)


....and so on.
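
As a rough illustration, the pattern above generalises to any number of words. This is a Python sketch only; the function name combine is made up here, and it assumes each input has already been clamped to the .01 to .99 range so the denominator cannot be zero.

```python
from math import prod


def combine(probs):
    """Combine per-word probabilities: abc... / (abc... + (1-a)(1-b)(1-c)...)."""
    p = prod(probs)                   # a * b * c * ...
    q = prod(1.0 - x for x in probs)  # (1 - a)(1 - b)(1 - c)...
    return p / (p + q)


print(combine([0.90, 0.90]))        # the two word example, about .99
print(combine([0.90, 0.97, 0.98]))  # the three word example
```

For the two word example this gives .81 / (.81 + .01), which is about .988, so it agrees with the .99 figure quoted earlier in the thread rather than the .90 average.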

Podge.


HuwR
Forum Admin

United Kingdom
20595 Posts

Posted - 28 August 2007 :  12:09:05
I'm not sure your logic holds up. It shouldn't make any difference how many good words are in a sentence: if it contains just one .99 word then it ought to be flagged bad, regardless of how many good words are in it. If not, things like Bayesian filters would be pretty easy to fool by flooding the text with "good" words.
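
To put some numbers on the flooding concern, here is a small Python sketch. It reuses the abc / (abc + (1-a)(1-b)(1-c)) combination posted earlier in the thread and adds one possible mitigation: scoring only the words farthest from the neutral .5. As far as I know that is what Paul Graham's essay "A Plan for Spam" does, using the fifteen most interesting words, and the Lisp snippet in this thread appears to come from that essay. The helper names and the example probabilities are made up for illustration.

```python
from math import prod


def combine(probs):
    """abc... / (abc... + (1-a)(1-b)(1-c)...), as posted earlier in the thread."""
    p = prod(probs)
    q = prod(1.0 - x for x in probs)
    return p / (p + q)


def most_interesting(probs, limit=15):
    """Keep the `limit` probabilities farthest from the neutral 0.5."""
    return sorted(probs, key=lambda x: abs(x - 0.5), reverse=True)[:limit]


# Two strongly bad words padded with thirty mildly good (0.4) words.
flooded = [0.99, 0.95] + [0.4] * 30

print(combine(flooded))                    # dragged down to roughly 0.01
print(combine(most_interesting(flooded)))  # stays at roughly 0.9
```

Capping the word count only limits dilution by mildly good or unknown words. Padding with words that score .01 would still pull the result down, because they are just as far from .5 as a .99 word, so the concern is not fully answered by this sketch.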