Author Topic: [PHP] URL Regular Expression  (Read 16400 times)

Offline Sidoh

  • Moderator
  • Hero Member
  • *****
  • Posts: 17634
  • MHNATY ~~~~~
    • View Profile
    • sidoh
[PHP] URL Regular Expression
« on: September 11, 2005, 10:52:44 pm »
Does anyone have one that works?  I've found a few on the internet, but it's been so long since I've written my own regular expressions that I'm out of practice.

I'm looking for one that follows the syntax:

<protocol>://<subdomain>.<domain>.<com/net,etc>

Thanks in advance!

Edit --

Since no one who posted in this thread was able to come up with one, I was forced to do it on my own!  *sob*

So that the rest of you don't have to suffer through such horrid events, I'll post my solution at the top of the thread:

Code: [Select]
$search[] = "^(((ht|f)tp(s?))\:\/\/)(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.([a-z])(\:[0-9]+)*((\/?))(([a-zA-Z0-9\.\,\;\?\'\\\=\/\_\-\#]+)?)^";
« Last Edit: September 14, 2005, 04:37:55 pm by Sidoh »

Offline deadly7

  • 42
  • Moderator
  • Hero Member
  • *****
  • Posts: 6496
    • View Profile
Re: [PHP] URL Regular Expression
« Reply #1 on: September 11, 2005, 10:59:13 pm »
One word: Huh?
[17:42:21.609] <Ergot> Kutsuju you're girlfrieds pussy must be a 403 error for you
 [17:42:25.585] <Ergot> FORBIDDEN

on IRC playing T&T++
<iago> He is unarmed
<Hitmen> he has no arms?!

on AIM with a drunk mythix:
(00:50:05) Mythix: Deadly
(00:50:11) Mythix: I'm going to fuck that red dot out of your head.
(00:50:15) Mythix: with my nine

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #2 on: September 11, 2005, 11:03:41 pm »
This is the one I found:

Code: [Select]
/^(((ht|f)tp(s?))\:\/\/)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$/
I'm not too sure what I need to begin or end that string with, but it doesn't work with preg_replace.
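
One probable cause, offered as a sketch rather than a diagnosis: the pattern uses "/" as its delimiter but also contains an unescaped "/" near the end, so PHP treats whatever follows that slash as modifiers and typically reports an unknown-modifier warning. In addition, the "^" and "$" anchors only allow a match when the entire string is a URL. The example below assumes a simplified pattern (not the one quoted above) with "#" as the delimiter and no anchors; the sample text is made up for illustration.

```php
<?php
// Sketch only: a simplified pattern, not the one quoted in the post.
// "#" is the delimiter, so "/" needs no escaping inside the pattern,
// and there are no ^ / $ anchors, so URLs inside longer text can match.
$pattern = '#(?:ht|f)tps?://[a-zA-Z0-9\-.]+\.[a-z]{2,6}(?::[0-9]+)?(?:/[a-zA-Z0-9.,;?\'+&%$\#=~_\-/]*)?#';

$text = 'See http://www.google.com for details.';

// $0 in the replacement stands for the whole matched URL.
echo preg_replace($pattern, '<a href="$0">$0</a>', $text), "\n";
```

Any character that never occurs in the pattern can serve as the delimiter; "#" is just a common choice when matching URLs.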

Offline Quik

  • Webmaster Guy
  • x86
  • Hero Member
  • *****
  • Posts: 3262
  • \x51 \x75 \x69 \x6B \x5B \x78 \x38 \x36 \x5D
    • View Profile
Re: [PHP] URL Regular Expression
« Reply #3 on: September 11, 2005, 11:04:45 pm »
Maybe you could include some information about what you need this for.
Quote
[20:21:13] xar: i was just thinking about the time iago came over here and we made this huge bomb and light up the sky for 6 min
[20:21:15] xar: that was funny

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #4 on: September 11, 2005, 11:31:50 pm »
Like this:

http://www.google.com

See how SMF automatically generates an anchor tag because it recognizes it as a link?  That's what I want.

Offline Quik

Re: [PHP] URL Regular Expression
« Reply #5 on: September 11, 2005, 11:39:15 pm »
If 'http://' is present, it creates <a href=" ...

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #6 on: September 11, 2005, 11:50:22 pm »
It's more elaborate than that.

ftp://www.something.com

And I would probably settle for that, but I'm not sure how to translate it into a regular expression, which is why I'm creating this post.

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #7 on: September 12, 2005, 05:29:59 pm »
Someone's got to have one... :(

Offline Blaze

  • x86
  • Hero Member
  • *****
  • Posts: 7136
  • Canadian
    • View Profile
    • Maide
Re: [PHP] URL Regular Expression
« Reply #8 on: September 12, 2005, 06:57:00 pm »
I'd search for www, http and ://.
And like a fool I believed myself, and thought I was somebody else...

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #9 on: September 12, 2005, 07:02:48 pm »
Quote from: Blaze
I'd search for www, http and ://.

I know that... :P

I want it translated into a Regular Expression.  That's why I put regex in the title!  :D

Offline Ryan Marcus

  • Cross Platform.
  • Full Member
  • ***
  • Posts: 170
  • I'm Bono.
    • View Profile
    • My Blog
Re: [PHP] URL Regular Expression
« Reply #10 on: September 13, 2005, 04:39:19 pm »
It's not all that hard.

Method 1: Split the message into an array separated by spaces, one element per word. Then use parse_url. Slow and clunky.
Method 2: Use strpos to search for "http", "://", ".com", ".net" and ".org". Use a buffer-type method.
Method 3: Use a good WYSIWYG web-based editor, like the one in exponent.

Hope I helped...
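
Method 1 above could look roughly like the following. This is a minimal sketch under assumptions: linkify_words() is a hypothetical name, the list of accepted schemes is a guess at what the thread wants, and the htmlspecialchars() call is an addition that nobody in the thread mentioned.

```php
<?php
// Sketch of "Method 1": split on spaces, then test each word with parse_url().
// linkify_words() is a hypothetical helper used only for this illustration.
function linkify_words(string $message): string
{
    $out = [];
    foreach (explode(' ', $message) as $word) {
        // parse_url() returns null for a missing component, false for badly malformed input.
        $scheme = parse_url($word, PHP_URL_SCHEME);
        $host   = parse_url($word, PHP_URL_HOST);
        if (in_array($scheme, ['http', 'https', 'ftp'], true) && !empty($host)) {
            // Escaping is an extra precaution added here, not part of the original suggestion.
            $safe  = htmlspecialchars($word, ENT_QUOTES);
            $out[] = '<a href="' . $safe . '">' . $safe . '</a>';
        } else {
            $out[] = $word;
        }
    }
    return implode(' ', $out);
}

echo linkify_words('Try http://www.google.com or ftp://www.something.com today'), "\n";
```

Because the split is on single spaces only, a URL followed directly by a line break or punctuation would need extra handling.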

Thanks, Ryan Marcus

Quote
<OG-Trust> I BET YOU GOT A CAR!
<OG-Trust> A JAPANESE CAR!
Quote
deadly: Big blue fatass to the rescue!
496620796F75722072656164696E6720746869732C20796F75722061206E6572642E00

Offline Blaze

Re: [PHP] URL Regular Expression
« Reply #11 on: September 13, 2005, 04:40:33 pm »
How about you combine method2 and 1...

Search for it, then use that function.

Offline Ryan Marcus

Re: [PHP] URL Regular Expression
« Reply #12 on: September 13, 2005, 04:41:43 pm »
Why bother? Once you know it's a URL, you don't need to parse it. Just add the <a> tag.

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #13 on: September 13, 2005, 04:49:51 pm »
Quote from: Ryan Marcus
It's not all that hard.

Method 1: Split the message into an array separated by spaces, one element per word. Then use parse_url. Slow and clunky.
Method 2: Use strpos to search for "http", "://", ".com", ".net" and ".org". Use a buffer-type method.
Method 3: Use a good WYSIWYG web-based editor, like the one in exponent.

Hope I helped...


You're thinking of too abstract a method.

I want a regular expression that defines a URL.  I posted one, but there's obviously something incorrect about it--it doesn't work.

Regular expressions are vastly more efficient than anything you posted.  There's no reason for me to use a WYSIWYG editor.  I'm developing a function that will find URLs in dynamic content and turn them into links.  I just need a regular expression that accurately defines a URL.  After that, I'd just use preg_replace to replace each URL with the same URL wrapped in an anchor tag.

It's been a long time since I've worked with regular expressions much, and I was hoping there was someone here who's more fresh with them than I.

Offline Joe

  • B&
  • x86
  • Hero Member
  • *****
  • Posts: 10319
  • In Soviet Russia, text read you!
    • View Profile
    • Github
Re: [PHP] URL Regular Expression
« Reply #14 on: September 13, 2005, 10:53:25 pm »
for ($i = 0; $i < strlen($data); $i++) {
  if (substr($data, $i, 7) == "http://") {
    // url starts here
  } elseif (substr($data, $i, 6) == "ftp://") {
    // omfg another url!
  } elseif (false /* something */) {
    // you get the drift
  } else {
    // LOL NOTHING
  }
}


EDIT -
I had my less than sign backwards, as usual. -sigh-
« Last Edit: September 13, 2005, 11:01:14 pm by Joe[e2] »
I'd personally do as Joe suggests

You might be right about that, Joe.


Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #15 on: September 13, 2005, 11:04:19 pm »
Inefficient.

Code: [Select]

$message = preg_replace("<REGULAR EXPRESSION THAT WORKS>", '<a href="\1">\1</a>', $message);


SO PWNS that.  I just need a working regular expression.  I'm shocked none of you have worked with them.  O_o

Well, not so much you, Joe.  You're a VB person =p
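
For reference, here is a small sketch of how the numbered references in a preg_replace replacement string behave. The pattern is a deliberately tiny stand-in and example.com is a made-up address; neither comes from the thread.

```php
<?php
// Sketch: how \0 and \1 differ in a preg_replace() replacement string.
// The pattern is a tiny stand-in (an assumption), not a complete URL matcher.
$message = 'Visit http://example.com now';

// \0 (or $0) is the entire match.
$whole = preg_replace('#https?://(\S+)#', '<a href="\0">\0</a>', $message);

// \1 (or $1) is only the first parenthesised group, here the part after "://".
$group = preg_replace('#https?://(\S+)#', '[\1]', $message);

echo $whole, "\n";
echo $group, "\n";
```

The replacement strings are single-quoted on purpose: inside a double-quoted PHP string, "\1" is read as an octal character escape before preg_replace ever sees it.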

Offline Joe

Re: [PHP] URL Regular Expression
« Reply #16 on: September 13, 2005, 11:56:54 pm »
<?php
  echo findurl("ftp://www.test.org str1 http://www.test.com str2 ftp://www.x86labs.org str3");

  function findurl($data) {
    $ret = "";
    $words = explode(" ", $data);
    foreach ($words as $word) {
      if (substr($word, 0, 7) == "http://") {
        $ret = $ret . '<a href="' . $word . '">' . $word . "</a> ";
      } elseif (substr($word, 0, 6) == "ftp://") {
        $ret = $ret . '<a href="' . $word . '">' . $word . "</a> ";
      } else {
        $ret = $ret . $word . " ";
      }
    }
    return $ret;
  }
?>


*backs away while it still works*

http://www.javaop.com/~joe/url.php
« Last Edit: September 15, 2005, 08:03:42 pm by Joe[e2] »


Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #17 on: September 14, 2005, 12:04:40 am »
Thanks, but I'm going to stick to finding a working regular expression.  :P

Offline Joe

Re: [PHP] URL Regular Expression
« Reply #18 on: September 14, 2005, 12:23:30 am »
A WORKING? WTF U SMOKIN NGR?


Offline Mythix

  • The Dude
  • x86
  • Hero Member
  • *****
  • Posts: 1569
  • Victory
    • View Profile
    • Dark-Wire
Re: [PHP] URL Regular Expression
« Reply #19 on: September 14, 2005, 02:45:52 pm »
pft we all know joe hardcoded those links.
Philosophy, n. A route of many roads leading from nowhere to nothing.

- Ambrose Bierce


Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #20 on: September 14, 2005, 04:32:26 pm »
Code: [Select]
<?php

$seach = array();
$replace = array();

$search[] = "^(((ht|f)tp(s?))\:\/\/)(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.([a-z])(\:[0-9]+)*((\/?))(([a-zA-Z0-9\.\,\;\?\'\\\=\/\_\-\#]+)?)^";
$replace[] = '<a href="\0" target="_blank">\0</a>';

$string = "Hello.  This is a test string.  http://www.google.com  <br /><br />I hope this serves as proof that regular expressions > all.  http://sidoh.no-ip.org/reg-ex.php FTW. <br /><br /> Visit http://www.x86labs.org/forum/index.php?topic=2790.msg27101#msg27101 for more information!";

$string = preg_replace($search, $replace, $string);

echo $string;

?>

http://sidoh.no-ip.org/reg-ex.php

Owned.
« Last Edit: September 14, 2005, 04:35:31 pm by Sidoh »

Offline Joe

Re: [PHP] URL Regular Expression
« Reply #21 on: September 14, 2005, 05:07:38 pm »
You should be shot? =p


Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #22 on: September 15, 2005, 12:55:27 pm »

Offline deadly7

Re: [PHP] URL Regular Expression
« Reply #23 on: September 15, 2005, 06:11:58 pm »
Sidoh, do you always mess up your code?
Code: [Select]
<?php

[b]$seach = array();[/b]
$replace = array();

[b]$search[] = [/b]
"^(((ht|f)tp(s?))\:\/\/)(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.([a-z])(\:[0-9]+)*((\/?))(([a-zA-Z0-9\.\,\;\?\'\\\=\/\_\-\#]+)?)^";
$replace[] = '<a href="\0" target="_blank">\0</a>';

$string = "Hello. This is a test string. http://www.google.com <br /><br />I hope this serves as proof that regular expressions > all. http://sidoh.no-ip.org/reg-ex.php FTW. <br /><br /> Visit http://www.x86labs.org/forum/index.php?topic=2790.msg27101#msg27101 for more information!";

$string = preg_replace($search, $replace, $string);

echo $string;

?>

http://sidoh.no-ip.org/reg-ex.php

Owned.

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #24 on: September 15, 2005, 06:16:32 pm »
Note that [ php] is a syntax-highlighting tag, so don't expect the bold tag to work inside it in the future.

Additionally, since PHP can define variables implicitly, I didn't mess up my code.  It works fine; there's just memory allocated for an array that isn't used.  :P
« Last Edit: September 15, 2005, 06:18:33 pm by Sidoh »
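
The point about implicit variables can be checked with a short sketch. The variable names mirror the ones in the thread; the 'pattern' value is only a placeholder.

```php
<?php
// Sketch: appending with [] to an undefined variable creates the array implicitly,
// so the misspelled initialiser ($seach) is merely unused rather than an error.
$seach = array();        // typo from the thread: never read again
$search[] = 'pattern';   // $search did not exist yet; PHP creates it here

var_dump(isset($search), count($search), count($seach));
```

Initialising the correctly spelled variable up front is still the tidier habit, since it keeps stale data out if the code is ever run in a scope where $search already exists.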

Offline Joe

Re: [PHP] URL Regular Expression
« Reply #25 on: September 15, 2005, 08:03:05 pm »
echo(findurl(""HelloThis is a test stringhttp://www.google.com <br /><br />I hope this serves as proof that regular expressions > all. http://sidoh.no-ip.org/reg-ex.php FTW. <br /><br /> Visit http://www.x86labs.org/forum/index.php?topic=2790.msg27101#msg27101 for more information!");


Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #26 on: September 15, 2005, 08:17:50 pm »
Quote from: Joe
echo(findurl(""HelloThis is a test stringhttp://www.google.com <br /><br />I hope this serves as proof that regular expressions > all. http://sidoh.no-ip.org/reg-ex.php FTW. <br /><br /> Visit http://www.x86labs.org/forum/index.php?topic=2790.msg27101#msg27101 for more information!");

o_O

Offline Joe

Re: [PHP] URL Regular Expression
« Reply #27 on: September 15, 2005, 08:58:53 pm »
You can fix it.


Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #28 on: September 15, 2005, 09:02:22 pm »

Offline Blaze

Re: [PHP] URL Regular Expression
« Reply #29 on: September 15, 2005, 09:13:06 pm »

<?php

echo(findurl("Hello. This is a test string. http://www.google.com <br /><br />I hope this serves as proof that regular expressions > all. http://sidoh.no-ip.org/reg-ex.php FTW. <br /><br /> Visit http://www.x86labs.org/forum/index.php?topic=2790.msg27101#msg27101 for more information!"));

?>


Like that?

Offline deadly7

Re: [PHP] URL Regular Expression
« Reply #30 on: September 15, 2005, 09:13:45 pm »
Quote from: Sidoh
Note that [ php] is a syntax-highlighting tag, so don't expect the bold tag to work inside it in the future.

Additionally, since PHP can define variables implicitly, I didn't mess up my code.  It works fine; there's just memory allocated for an array that isn't used.  :P

Which is stupid. :)

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #31 on: September 15, 2005, 09:16:01 pm »

Quote from: Blaze

<?php

echo(findurl("Hello. This is a test string. http://www.google.com <br /><br />I hope this serves as proof that regular expressions > all. http://sidoh.no-ip.org/reg-ex.php FTW. <br /><br /> Visit http://www.x86labs.org/forum/index.php?topic=2790.msg27101#msg27101 for more information!"));

?>


Like that?

/golfclap  :P

Offline Blaze

Re: [PHP] URL Regular Expression
« Reply #32 on: September 15, 2005, 11:20:58 pm »
I don't get it...  :-[

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #33 on: September 15, 2005, 11:38:23 pm »
Quote from: deadly7
Quote from: Sidoh
Note that [ php] is a syntax-highlighting tag, so don't expect the bold tag to work inside it in the future.

Additionally, since PHP can define variables implicitly, I didn't mess up my code.  It works fine; there's just memory allocated for an array that isn't used.  :P
Which is stupid. :)

Blame PHP, not me.

Offline Sidoh

Re: [PHP] URL Regular Expression
« Reply #34 on: October 03, 2005, 09:33:01 pm »
Okay, this is really pretty frustrating, and I should have realized it earlier.  My bbcode function also replaces [link ] tags, and the URL inside such a tag follows the pattern of this regular expression too.  So when the tag is converted to an HTML <a> tag, the URL it contains is also matched and replaced by the URL regular expression.  It's really annoying.  I found a way around it, but it's not ideal.  I think I'm going to have to resort to some search/buffer/replace method like the ones originally suggested.  Even though that is a lot slower and less efficient, it's really the only approach I can foresee working in every case.

Here's the regular expression I used (it works great!):

Code: [Select]
$s[] = "^[\s]+(((ht|f)tp(s?))\:\/\/)(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.([a-z])(\:[0-9]+)*((\/?))(([a-zA-Z0-9\.\,\;\?\'\\\=\/\_\-\#\%\&]+)?)^";
$r[] = '<a href="\0" target="_blank">\0</a>';

Notice the "[\s]+", which requires at least one whitespace character (space, line break, tab, etc.) immediately before the URL.  This works in most cases, but what if the link is at the beginning of the text or IS the entire text?  Yeah, doesn't work in those situations.
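
One way to cover the start-of-text case, sketched under assumptions: replace the mandatory whitespace with an alternation that accepts either the start of the text or a whitespace character. The URL part below is a simplified stand-in for the long pattern above, and $s and $r are plain variables here rather than array entries.

```php
<?php
// Sketch only: (^|\s) accepts the start of the text OR one whitespace character.
// \1 puts that whitespace back, so only the URL itself (\2) ends up in the anchor.
// The URL part is a simplified stand-in, not the thread's full pattern.
$s = '#(^|\s)((?:ht|f)tps?://[^\s<\[\]]+)#i';
$r = '\1<a href="\2" target="_blank">\2</a>';

echo preg_replace($s, $r, 'http://www.google.com is first, then http://www.x86labs.org'), "\n";
```

Capturing the URL in its own group also keeps the leading whitespace out of the href, which a replacement built on \0 would include.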

Without the whitespace requirement, though, the URL pattern conflicts with the following regular expression, because the tag it handles contains a URL itself:

Code: [Select]
$s[] = "#\[(link|url)=(((ht|f)tp(s?))\:\/\/)(.*?)](.*?)\[/link\]#si";
$r[] = '<a href="\2\6" target="_blank">\7</a>';
This handles the [ link ] [ /link ] tag.  It works fine in isolation, but once the URL regular expression has run through preg_replace, the tag has already been turned into the following:

[link=<a href="<URL>"><URL></a>]<text>[/link]

There's one possible solution I can think of that still keeps the regular expression approach, but I haven't gotten it to work.  That is: match this string only if it is NOT preceded by another string.  I.e. the match can't begin with url= or link=, which would eliminate my problem with this.
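
PCRE has a construct for exactly this kind of exclusion, the negative lookbehind. The following is a minimal sketch, not a tested drop-in for the bbcode function: the URL part is again a simplified stand-in, and the sample text is invented.

```php
<?php
// Sketch only: (?<!url=) and (?<!link=) are negative lookbehinds, so a URL that
// directly follows "url=" or "link=" is skipped. \b stops a match from starting
// in the middle of a longer word. The URL part is a simplified stand-in.
$s = '#(?<!url=)(?<!link=)\b((?:ht|f)tps?://[^\s<\[\]]+)#i';
$r = '<a href="\1" target="_blank">\1</a>';

$text = 'Plain http://www.google.com and [link=http://www.x86labs.org]x86[/link]';
echo preg_replace($s, $r, $text), "\n";
```

This only guards against the two prefixes named above. A URL used as the link text between the tags, or one already sitting inside an href attribute, would still match and would need further lookbehinds or a different processing order.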

I'm going to try to mess around with a bit of search/buffer methodologies of URL replacing.  I'll post with updates.
« Last Edit: October 03, 2005, 09:35:12 pm by Sidoh »