Snitz Forums 2000
 Code Support: ASP (Non-Forum Related)
 Extract Subdomain Regex
Podge
Support Moderator

Ireland
3776 Posts

Posted - 02 December 2004 :  17:02:02
I'm trying to get the subdomain out of http://subdomain.domain.com

Currently I'm using this (not pretty but it works) code.

subDomain = request.servervariables("SERVER_NAME")
myArray = Split(subDomain, ".")
subDomain = myArray(0)

As long as there is only one "." it should work.

Anyone know a better (regex?) way of doing it ?

Podge.

The Hunger Site - Click to donate free food | My Blog | Snitz 3.4.05 AutoInstall (Beta!)

My Mods: CAPTCHA Mod | GateKeeper Mod
Tutorial: Enable subscriptions on your board

Warning: The post above or below may contain nuts.

pdrg
Support Moderator

United Kingdom
2897 Posts

Posted - 03 December 2004 :  06:27:57
Do you want the bit after the http:// or the bit two to the left of the .com?

I see problems either way - for instance http://www.someusername.freeserve.co.uk would stump both of the above. You need to know the dataset you'll be using, and work backwards from there, I suspect.

Podge
Support Moderator

Ireland
3776 Posts

Posted - 03 December 2004 :  07:27:58
The bit to the left of domain.com e.g.

http://subdomain.domain.com
http://www.subdomain.domain.com

I also need to check that the subdomain only contains letters and numbers.

I can do it with string manipulation but I reckon a regex would be faster.
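
For comparison, here is a minimal sketch of the string-manipulation route in VBScript. It assumes the host name always ends in one fixed domain; the constant BASE_DOMAIN and the function name SubdomainBySplit are made up for illustration, and returning an empty string for "no valid subdomain" is just one possible convention.

```vbscript
' Hypothetical constant: the thread only ever says "domain.com".
Const BASE_DOMAIN = "domain.com"

' Returns the label immediately left of BASE_DOMAIN, or "" when the host
' has no such label or the label contains anything but letters and digits.
Function SubdomainBySplit(ByVal host)
    Dim suffix, prefix, labels, label, i, ch
    SubdomainBySplit = ""
    host = LCase(host)
    suffix = "." & BASE_DOMAIN

    ' The host must end in ".domain.com" and have something in front of it.
    If Len(host) <= Len(suffix) Then Exit Function
    If Right(host, Len(suffix)) <> suffix Then Exit Function

    ' Drop the suffix, then take the last remaining label.
    prefix = Left(host, Len(host) - Len(suffix))
    labels = Split(prefix, ".")
    label = labels(UBound(labels))
    If Len(label) = 0 Then Exit Function

    ' Letters and digits only.
    For i = 1 To Len(label)
        ch = Mid(label, i, 1)
        If InStr("abcdefghijklmnopqrstuvwxyz0123456789", ch) = 0 Then Exit Function
    Next

    SubdomainBySplit = label
End Function
```

In an ASP page this would be called as SubdomainBySplit(Request.ServerVariables("SERVER_NAME")). Because it picks the last label before the fixed domain rather than element 0 of the array, a leading www. does not get in the way.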

Podge.


Podge
Support Moderator

Ireland
3776 Posts

Posted - 03 December 2004 :  08:03:27
I should have mentioned that the domain.com will always be the same.

Podge.


pdrg
Support Moderator

United Kingdom
2897 Posts

Posted - 03 December 2004 :  08:04:27
OK, I get ya, but if you break the problem out more - e.g. how do you cope with .co.uk domains - it will help you envision the solution more easily.

I agree in principle with your idea to use regex, but even with regex you still need to work out how to handle exceptions (like .co.uk). Also, can subdomains contain the special characters '-' and '_'? And will all domains come with the http:// bit?

The regex pattern would be something like this (untested):

(?:http\:\/\/)?(?:www\.)?([A-Za-z0-9_-]*)\.[A-Za-z0-9._-]*$

That is: optional, non-capturing http:// and www. (the ? after each group is what makes it optional); the first captured group is any letters, digits, _ or - up to the next dot; the rest of the string is then matched and discarded.

Someone may tidy this up a bit, but assuming you just feed it domains, it should handle it (but check whether the escaped characters really need to be escaped - it may cause hiccoughs otherwise!)

hth


edit: this will not work *if there is no subdomain*, but that wasn't part of the request!
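
A rough sketch of how a pattern along these lines could be run from VBScript with the RegExp object. The function name FirstLabel is made up for illustration, and the pattern is written with a ? after each non-capturing group so that http:// and www. are optional, as the description says. It assumes the input is a bare host or URL with nothing after the domain.

```vbscript
' Returns the first captured group, or "" when the pattern does not match.
Function FirstLabel(ByVal url)
    Dim re, matches
    Set re = New RegExp
    re.IgnoreCase = True
    ' The "?" after each (?:...) group makes that group optional.
    re.Pattern = "(?:http\:\/\/)?(?:www\.)?([A-Za-z0-9_-]*)\.[A-Za-z0-9._-]*$"

    FirstLabel = ""
    Set matches = re.Execute(url)
    If matches.Count > 0 Then
        ' Non-capturing groups are not stored, so the label is SubMatches(0).
        FirstLabel = matches(0).SubMatches(0)
    End If
End Function
```

With this, FirstLabel("http://www.subdomain.domain.com") gives "subdomain", while FirstLabel("http://domain.com") gives "domain", which is the no-subdomain case mentioned in the edit note.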

Edited by - pdrg on 03 December 2004 08:05:12

Podge
Support Moderator

Ireland
3776 Posts

Posted - 03 December 2004 :  08:51:07
Thanks for taking the time to help me with this. It's a good start.

Basically I'm trying to find the most efficient code which would return the subdomain to me in a string.

I won't be checking lots of different domains and their subdomains. It will be on one particular domain. There won't be any .net or .co.uk.

I'll outline the rules as best I can.

Although subdomains could contain "-" or "_", they would be illegal for my purposes. All subdomains should only have letters or numbers and are of at least length 4.

If no subdomain exists, that's already catered for by DNS (as are the A host records for mail, ftp, www etc.). DNS will direct those to the correct website, so the regex won't get a chance to run.

This is what would happen in the following situations

http://sub1.domain.com - I need "sub1" as a string
http://sub2.sub1.domain.com - I need "sub1" as a string
http://sub-marine.domain.com - Return an error

I don't need to check for http:// or https:// or that the domain.com is valid.
It will always be valid.
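
Put together, the rules above could be sketched like this in VBScript. This is only one way to do it: it assumes the input is the bare host name (as returned by SERVER_NAME, with no http:// and no path), that "domain.com" stands in for the real domain, and that an empty string is an acceptable way to signal the error case. The function name ExtractSubdomain is made up for illustration.

```vbscript
' Returns the label immediately left of ".domain.com" when it is made of
' letters and digits only and is at least 4 characters long; otherwise "".
Function ExtractSubdomain(ByVal host)
    Dim re, matches
    Set re = New RegExp
    re.IgnoreCase = True
    ' ^(?:[^.]+\.)*        skip any leading labels such as "sub2." or "www."
    ' ([A-Za-z0-9]{4,})    capture the label next to the domain
    ' \.domain\.com$       the fixed domain must end the string
    re.Pattern = "^(?:[^.]+\.)*([A-Za-z0-9]{4,})\.domain\.com$"

    ExtractSubdomain = ""
    Set matches = re.Execute(host)
    If matches.Count > 0 Then ExtractSubdomain = matches(0).SubMatches(0)
End Function
```

For the three situations listed, this returns "sub1" for sub1.domain.com and for sub2.sub1.domain.com, and "" for sub-marine.domain.com, which the caller can treat as the error.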


Podge.


pdrg
Support Moderator

United Kingdom
2897 Posts

Posted - 03 December 2004 :  09:20:15
And will that string be clean? ie the domain won't be in the middle of a load of 'stuff'?

(?:https?\:\/\/)(?:[A-Za-z0-9\-_]*\.?)([A-Za-z0-9]{4,})\.domain\.com should do it!

(?:https?\:\/\/) non-capturing http:// or https://
(?:[A-Za-z0-9\-_]*\.?) should non-capture any leading subdomain (such as sub2) with its optional dot
([A-Za-z0-9]{4,}) should capture the alphanumeric string at least 4 characters long
\.domain\.com note that \. escapes the dot so it matches a literal '.'

Caveat - this would need a bit of in-situ testing, as the (?:[A-Za-z0-9\-_]*\.?) term may also match sub2 AND sub1. If it does, this will need some tinkering, or you may find the sub1 match is not the zeroth submatch in the submatch collection returned by the regex object, but maybe the first.
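
A small harness for that in-situ testing might look like the sketch below. LOOSE is the pattern assembled from the four pieces listed above; TIGHT is a hypothetical variant (not from the thread) in which the skipped prefix has to be whole labels, each ending in a dot. The names LOOSE, TIGHT and CaptureWith are made up for illustration.

```vbscript
' Pattern assembled from the four pieces listed above.
Const LOOSE = "(?:https?\:\/\/)(?:[A-Za-z0-9\-_]*\.?)([A-Za-z0-9]{4,})\.domain\.com"
' Hypothetical variant: the skipped prefix must be whole labels, each ending in a dot.
Const TIGHT = "(?:https?\:\/\/)(?:[A-Za-z0-9\-_]+\.)*([A-Za-z0-9]{4,})\.domain\.com"

' Returns the first captured group for the given pattern, or "" on no match.
Function CaptureWith(ByVal pattern, ByVal url)
    Dim re, matches
    Set re = New RegExp
    re.IgnoreCase = True
    re.Pattern = pattern
    CaptureWith = ""
    Set matches = re.Execute(url)
    ' Non-capturing groups never appear in SubMatches, so index 0 is the label.
    If matches.Count > 0 Then CaptureWith = matches(0).SubMatches(0)
End Function
```

Both patterns give "sub1" for http://sub1.domain.com and http://sub2.sub1.domain.com. The difference shows up elsewhere: because the optional prefix in LOOSE is greedy and does not have to stop at a dot, it can eat into the label itself, so http://subdomain.domain.com yields "main" and http://sub-marine.domain.com yields "rine" instead of failing. TIGHT returns "subdomain" and "" for those two inputs.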

hope this is helpful! Maybe someone else wants to point out if this is a hiding to nothing, but if we go down the regex route, I think the above is 'about right'

Podge
Support Moderator

Ireland
3776 Posts

Posted - 03 December 2004 :  09:24:40
quote:
And will that string be clean? ie the domain won't be in the middle of a load of 'stuff'?

Now I get you. It will have stuff after it like /index.asp?querystring=12

I'll give it a test with a few subdomains later and see how it works.

Thanks again Pdrg.

Podge.

Edited by - Podge on 03 December 2004 09:25:04

pdrg
Support Moderator

United Kingdom
2897 Posts

Posted - 03 December 2004 :  09:31:44
(?:https?\:\/\/)(?:[A-Za-z0-9\-_]*\.?)([A-Za-z0-9]{4,})\.domain\.com.*

Note the ending .* which will match (and ignore) the rest of the string - however, I'm not sure it's even needed! Sorry I cannot test all this for you, but I haven't got the kit with me, so this is largely hypothetical!!

-gary
Development Team Member

406 Posts

Posted - 03 December 2004 :  11:26:09
Here's an online tester with the ability to toggle things like case and line break detection. They also have tons of code samples.

http://www.regexlib.com

KawiForums.com

