Webmaster talk
Redirecting double-slash ("//") URLs to single-slash ("/") U

 
Guy Macon
Guest





Posted: Mon Jul 07, 2008 11:34 pm    Post subject: Redirecting double-slash ("//") URLs to single-slash ("/") U

As a personal learning experience with limited practical use,
I have been doing some experiments with using .htaccess to
redirect mis-typed URLs to a preferred canonical form. I have
set up a test page at [ http://www.guymacon.org/test.html ]
to show the results of my testing.

Most of the URLs redirect as I want them to, but the three
URLs (with "//" instead of "/") in bold do not redirect.

I have searched the web and have not found a single website
that redirects *all* "//" URLs to "/" URLs.

Given the rarity of this error, a solution that causes a 404
error rather than a 301 redirect would be fine with me, but I
haven't seen any websites that manage that one either.

Any suggestions for things to try would be most welcome.
Thanks!

Page that demonstrates problem: [ http://www.guymacon.org/test.html ]


--
Guy Macon
<http://www.GuyMacon.com/>


Mark Goodge
Guest





Posted: Tue Jul 08, 2008 12:32 am    Post subject: Re: Redirecting double-slash ("//") URLs to single-slash ("/

On Mon, 07 Jul 2008 18:34:04 +0000, Guy Macon
<http://www.GuyMacon.com/> put finger to keyboard and typed:

Quote:
As a personal learning experience with limited practical use,
I have been doing some experiments with using .htaccess to
redirect mis-typed URLs to a preferred canonical form. I have
set up a test page at [ http://www.guymacon.org/test.html ]
to show the results of my testing.

Most of the URLs redirect as I want them to do, but the three
URLs (with "//" instead of "/") in bold do not redirect.

It isn't working because it's only redirecting if the requested
document is not a validly existing file or directory. The web server
isn't determining what constitutes an existing file or directory, it's
letting the underlying OS tell it that. But a double slash anywhere in
the file path is a perfectly valid file path on a Unix-like system, so
the OS returns it as valid and hence the web server doesn't redirect.
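
The filesystem half of this claim can be checked without a web server. Here is a minimal Python sketch, assuming a Unix-like system; the temporary directory and the file names are made up for the illustration, and it says nothing about what Apache itself does to the URL before asking the OS.

```python
import os
import tempfile


def same_file_despite_extra_slashes():
    """Create a file, then reach it again through a path with doubled separators."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "subdirectory"))
        clean = os.path.join(root, "subdirectory", "not-index.html")
        with open(clean, "w") as handle:
            handle.write("hello")

        # Hypothetical "mistyped" path: every separator is doubled by hand.
        doubled = root + os.sep * 2 + "subdirectory" + os.sep * 2 + "not-index.html"

        # The OS resolves both spellings to the same file.
        return os.path.exists(doubled) and os.path.samefile(clean, doubled)


print(same_file_despite_extra_slashes())
```

If the OS treats the doubled path as valid, any "does this file exist?" check made against it will succeed, which is the situation described above.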

You can see this in action on a website that doesn't use mod_rewrite.
For example, this is a valid page on one of my own sites:

http://www.good-stuff.co.uk/links.php

That site doesn't use mod_rewrite. But this will still work:

http://www.good-stuff.co.uk//links.php

and so will this:

http://www.good-stuff.co.uk///links.php

On the same site, a URL in a subdirectory shows the same behaviour:

http://www.good-stuff.co.uk/crossword/
http://www.good-stuff.co.uk//crossword/
http://www.good-stuff.co.uk///crossword/

Quote:
I have searched the web and have not found a single website
that redirects *all* "//" URLs to "/" URLs.

I don't think it can be done with .htaccess, for the reason I've
described.

Quote:
Given the rarity of this error, a solution that causes a 404
error rather than a 301 redirect would be fine with me, but I
haven't seen any websites that manage that one either.

Any suggestions for things to try would be most welcome.
Thanks!

If you wanted to fix this particular problem, you'd need to do it by
means of server side scripting. But it's hardly likely to be a
problem, anyway.

Mark


Guy Macon
Guest





Posted: Tue Jul 08, 2008 2:32 am    Post subject: Re: Redirecting double-slash ("//") URLs to single-slash ("/

Mark Goodge wrote:
Quote:

Guy Macon <http://www.GuyMacon.com/> put finger to keyboard and typed:

I have been doing some experiments with using .htaccess to
redirect mis-typed URLs to a preferred canonical form. I have
set up a test page at [ http://www.guymacon.org/test.html ]
to show the results of my testing.

Most of the URLs redirect as I want them to do, but the three
URLs (with "//" instead of "/") in bold do not redirect.

It isn't working because it's only redirecting if the requested
document is not a validly existing file or directory. The web server
isn't determining what constitutes an existing file or directory, it's
letting the underlying OS tell it that. But a double slash anywhere in
the file path is a perfectly valid file path on a Unix-like system, so
the OS returns it as valid and hence the web server doesn't redirect.

Are you saying that // is valid and equal to / ? If so, that
would explain Apache *never* rewriting the // URL with a / URL,
but my tests show that using .htaccess I can:

301 Redirect http://www.guymacon.org/subdirectory//
to http://www.guymacon.org/subdirectory/
and
301 Redirect http://www.guymacon.org//subdirectory
to http://www.guymacon.org/subdirectory/

But I can *not* 301 redirect these (they stay the same):
http://www.guymacon.org//
http://www.guymacon.org//subdirectory/
http://www.guymacon.org//subdirectory/not-index.html

And this URL:
http://www.guymacon.org///subdirectory///not-index.html
redirects to a partially fixed URL:
http://www.guymacon.org//subdirectory/not-index.html
(!)

Or are you saying that // is valid and signifies two levels
of subdirectory whereas / signifies one level of subdirectory?
If so, every one of the above URLs is to a file or directory
that does not exist, and thus should redirect like all the
other missing files/directories, or at least return a 404 error.

Quote:
[the following are all] valid page[s] on one of my own sites:
http://www.good-stuff.co.uk/links.php
http://www.good-stuff.co.uk//links.php
http://www.good-stuff.co.uk///links.php
http://www.good-stuff.co.uk/crossword/
http://www.good-stuff.co.uk//crossword/
http://www.good-stuff.co.uk///crossword/
[...and they all work]


Quote:
I don't think it can be done with .htaccess, for the reason I've
described.


I just did some tests.

I tested this URL with WebBug (a very useful utility):
http://www.good-stuff.co.uk///crossword///index.php

Which caused the WebBug "browser" to send:

GET ///crossword///index.php HTTP/1.1
Host: www.good-stuff.co.uk
Connection: close
Accept: */*
User-Agent: WebBug/5.0

and your server returned:

HTTP/1.1 200 OK
Date: Mon, 07 Jul 2008 21:03:15 GMT
Server: Apache/2.2.3 (Debian) PHP/5.2.0-8+etch11
X-Powered-By: PHP/5.2.0-8+etch11
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
....
200a
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
....and so on, serving up your entire web page.

When I did the same test using
http://www.good-stuff.co.uk/crossword/

my GET changed from ///crossword///index.php to
/crossword/, and your server gave an identical
response.

So far, this behavior fits your theory.


Next I did the same test with
http://www.guymacon.org///subdirectory///not-index.html

Which caused the WebBug "browser" to send:

GET ///subdirectory///not-index.html HTTP/1.1
Host: www.guymacon.org
Connection: close
Accept: */*
User-Agent: WebBug/5.0

And my web server returned:

HTTP/1.1 301 Moved Permanently
Date: Mon, 07 Jul 2008 20:58:00 GMT
Server: Apache/2.2.9
Location: http://www.guymacon.org//subdirectory/not-index.html
Content-Length: 332
Connection: close
Content-Type: text/html; charset=iso-8859-1
....
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head><title>301 Moved Permanently</title></head>
<body><h1>Moved Permanently</h1><p>The document has moved
<a href="http://www.guymacon.org//subdirectory/not-index.html">here</a>.
</p><hr><address>Apache/2.2.9 Server at www.guymacon.org Port 80</address>
</body></html>

Next I did the same test with
http://www.guymacon.org/subdirectory/not-index.html

Which caused the WebBug "browser" to send:

GET /subdirectory/not-index.html HTTP/1.1
Host: www.guymacon.org
Connection: close
Accept: */*
User-Agent: WebBug/5.0

And my web server returned:

HTTP/1.1 200 OK
Date: Mon, 07 Jul 2008 21:01:06 GMT
Server: Apache/2.2.9
Last-Modified: Wed, 02 Jul 2008 00:03:14 GMT
ETag: "a3c534-3f4-450ff380b7480"
Accept-Ranges: bytes
Content-Length: 1012
Connection: close
Content-Type: text/html; charset=us-ascii
Content-Language: en-us
....
<!DOCTYPE html PUBLIC "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN">
....and so on, serving up my entire web page.

The above tests show Apache changing some // strings to / strings,
but not all. This seems to argue against your theory.

Quote:
I have searched the web and have not found a single website
that redirects *all* "//" URLs to "/" URLs.

Given the rarity of this error, a solution that causes a 404
error rather than a 301 redirect would be fine with me, but I
haven't seen any websites that manage that one either.

Any suggestions for things to try would be most welcome.
Thanks!

If you wanted to fix this particular problem, you'd need to do it by
means of server side scripting. But it's hardly likely to be a
problem, anyway.

I suspect that server side scripting will fail for the same reasons
that .htaccess fails -- I think Apache is mucking with some of the URLs
before passing them on to .htaccess *or* server side scripting, but I am
more than willing to try. What do you think of the following as a starting
point?

http://www.seoworkers.com/seo-articles-tutorials/permanent-redirects.html

(My server runs Apache 2.2.4 and supports mod_php 5.2.4, Perl 5.8.8
and Ruby 1.8.5.)

BTW, before someone chimes in and says that what I am trying to
do is not worth doing, the redirect may not be worth doing, but
to me learning how Apache handles various URLs is well worth
the effort of running a few tests and trying a few suggestions.
Education is almost always a good thing.

--
Guy Macon
<http://www.GuyMacon.com/>


Mark Goodge
Guest





Posted: Wed Jul 09, 2008 12:19 am    Post subject: Re: Redirecting double-slash ("//") URLs to single-slash ("/

On Mon, 07 Jul 2008 21:32:12 +0000, Guy Macon
<http://www.GuyMacon.com/> put finger to keyboard and typed:

Quote:



Mark Goodge wrote:

If you wanted to fix this particular problem, you'd need to do it by
means of server side scripting. But it's hardly likely to be a
problem, anyway.

I suspect that server side scripting will fail for the same reasons
that .htaccess fails -- I think Apache is mucking with some of the URLs
before passing them on to .htaccess *or* server side scripting, but I am
more than willing to try.

No; doing it with scripting will be fine. The actual path is passed to
Apache, which in turn passes it on to the script. If your server
supports PHP, then upload a phpinfo() page in various places and try
looking at it with different combinations of slashes in the URL.
You'll see (or should see!) that the actual path shows up in the
SCRIPT_URI and REQUEST_URI variables (among others).

Another interesting thing you'll see, though, with phpinfo() and
multiple slashes is that the additional slashes don't show up in the
SCRIPT_FILENAME variable.
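
For anyone without PHP to hand, roughly the same inspection can be sketched as a small CGI script in Python. This is only an illustration: the function name is hypothetical, and which of these variables are actually set depends on the server configuration, so any that are missing are reported as "(not set)".

```python
import os

# The variables named in the discussion above.
WATCHED = ("REQUEST_URI", "SCRIPT_URI", "SCRIPT_NAME", "SCRIPT_FILENAME")


def describe_request(environ=os.environ):
    """Return one 'NAME=value' line per watched variable."""
    lines = []
    for name in WATCHED:
        lines.append(f"{name}={environ.get(name, '(not set)')}")
    return lines


if __name__ == "__main__":
    # Minimal CGI response: header, blank line, then the plain-text body.
    print("Content-Type: text/plain")
    print()
    print("\n".join(describe_request()))
```

Requesting the script with different numbers of slashes and comparing the lines shows which variables keep the extra slashes and which do not.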

If you were writing a PHP script to redirect on multiple slashes,
therefore, you could fairly easily pick them out by comparing (say)
the SCRIPT_NAME and REQUEST_URI variables and issuing a redirect to
the former if they don't match. You'd need to be careful if you were
using mod_rewrite for other purposes (such as "SEO-friendly URLs") on
the site, as I suspect it could have odd effects, but if you just want
to shift one PHP page back to itself but in the correct location then
this would be suitable. Here's an example on my site. This works as
expected:

http://www.good-stuff.co.uk/rdtest.php

but this will redirect:

http://www.good-stuff.co.uk//rdtest.php

To show the source code, I've put up a plain text version:

http://www.good-stuff.co.uk/rdtest.txt

As you can see, it's dead simple.
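
The comparison described above can be sketched in a few lines. This is a Python illustration of the idea, not the contents of rdtest.txt; the function name is hypothetical, and it assumes the server hands the script a SCRIPT_NAME that is already free of extra slashes.

```python
def redirect_target(script_name, request_uri):
    """Return the path to redirect to, or None if the request is already canonical.

    script_name: the script's own path as reported by the server (SCRIPT_NAME).
    request_uri: the path and optional query string as requested (REQUEST_URI).
    """
    # Split by hand: a URL parser would read a leading "//" as a host name.
    path, sep, query = request_uri.partition("?")
    if path == script_name:
        return None
    # Redirect to the script's own path, keeping any query string.
    return script_name + sep + query


print(redirect_target("/rdtest.php", "/rdtest.php"))
print(redirect_target("/rdtest.php", "//rdtest.php"))
```

Note that under this comparison a request for a directory such as "/" also differs from "/index.php" and would be redirected to the index script. The check can only catch slashes that are missing from SCRIPT_NAME; if the server leaves a doubled slash in both variables, the two values match and nothing is redirected.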

Mark


Guy Macon
Guest





Posted: Wed Jul 09, 2008 4:25 pm    Post subject: Re: Redirecting double-slash ("//") URLs to single-slash ("/

I just set up a test on my guymacon.net domain.
It contains:

\robots.txt
\index.php
\rdtest.php
\subdirectory\index.php
\subdirectory\rdtest.php
...and nothing else. No htaccess, no css, etc.


Test results that failed:

http://www.guymacon.net/subdirectory//rdtest.php
no rewrite (bad)

http://www.guymacon.net/subdirectory//index.php
no rewrite (bad)

http://www.guymacon.net/subdirectory// rewrites to
http://www.guymacon.net/subdirectory//index.php (bad)

http://www.guymacon.net//subdirectory//index.php rewrites to
http://www.guymacon.net/subdirectory//index.php (half good, half bad)

http://www.guymacon.net//subdirectory// rewrites to
http://www.guymacon.net/subdirectory//index.php (half good, half bad)


Test results that passed:

http://www.guymacon.net//rdtest.php rewrites to
http://www.guymacon.net/rdtest.php (good)

http://www.guymacon.net//index.php rewrites to
http://www.guymacon.net/index.php (good)

http://www.guymacon.net// rewrites to
http://www.guymacon.net/index.php (good)

http://www.guymacon.net//subdirectory/rdtest.php rewrites to
http://www.guymacon.net/subdirectory/rdtest.php (good)

http://www.guymacon.net//subdirectory/index.php rewrites to
http://www.guymacon.net/subdirectory/index.php (good)

http://www.guymacon.net rewrites to
http://www.guymacon.net/index.php (good)

http://www.guymacon.net/ rewrites to
http://www.guymacon.net/index.php (good)

http://www.guymacon.net/subdirectory rewrites to
http://www.guymacon.net/subdirectory/index.php (good)

http://www.guymacon.net/subdirectory rewrites to
http://www.guymacon.net/subdirectory/index.php (good)

http://www.guymacon.net/rdtest.php no rewrite (good)

http://www.guymacon.net/index.php no rewrite (good)

http://www.guymacon.net/subdirectory/rdtest.php no rewrite (good)

http://www.guymacon.net/subdirectory/index.php no rewrite (good)
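
The behaviour the failed tests are asking for can be stated as a small function: collapse every run of slashes in the path and leave the query string alone. A sketch in Python follows; the function name is hypothetical, and it shows only the string transformation, not how to get Apache to hand over the unmodified path.

```python
import re


def collapse_slashes(request_uri):
    """Collapse runs of '/' in the path part of a request URI."""
    # Split off the query string so that something like ?next=http://... survives.
    path, sep, query = request_uri.partition("?")
    return re.sub(r"/{2,}", "/", path) + sep + query


for uri in ("/subdirectory//rdtest.php", "//subdirectory//", "/subdirectory/index.php"):
    print(uri, "->", collapse_slashes(uri))
```

A redirect would be issued only when the collapsed form differs from the requested one, so clean URLs are left untouched.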


--
Guy Macon
<http://www.GuyMacon.com/>


Toby A Inkster
Guest





Posted: Thu Jul 17, 2008 4:53 pm    Post subject: Re: Redirecting double-slash ("//") URLs to single-slash ("/

Mark Goodge wrote:

Quote:
I don't think it can be done with .htaccess, for the reason I've
described.

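RewriteEngine on
RewriteBase /
# Condition: the request URI contains "//" or "/./"
RewriteCond %{REQUEST_URI} \/\.?\/
# [R] sends an external redirect (302 unless a status is given) to the captured path
RewriteRule ^(.*)$ $1 [R]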
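
To see which request paths the condition's regular expression would flag, the same pattern can be tried outside Apache. A small Python sketch, with a hypothetical helper name, assuming the pattern behaves the same way in Python's regex engine for these simple inputs:

```python
import re

# Same expression as in the RewriteCond: a slash, an optional dot, a slash.
PATTERN = re.compile(r"\/\.?\/")


def would_match(request_uri):
    """True if the pattern is found anywhere in the given request URI."""
    return PATTERN.search(request_uri) is not None


for uri in ("//subdirectory/", "/subdirectory//not-index.html",
            "/./subdirectory/", "/subdirectory/not-index.html"):
    print(uri, would_match(uri))
```

The pattern only helps if the value it is tested against still contains the extra slashes; it cannot flag slashes that were removed before the condition is evaluated.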

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.24.4-1mnbcustom-g5n1, up 25 days, 19:49.]
[Now Playing: Ed Harcourt - In her own eyes]

Extending hCard with RDFa
http://tobyinkster.co.uk/blog/2008/07/16/hcard-rdfa/


Guy Macon
Guest





Posted: Fri Jul 18, 2008 1:46 am    Post subject: Re: Redirecting double-slash ("//") URLs to single-slash ("/

Toby A Inkster wrote:
Quote:

Mark Goodge wrote:

I don't think it can be done with .htaccess, for the reason I've
described.

RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} \/\.?\/
RewriteRule ^(.*)$ $1 [R]

Nope. The same URLs that fail to redirect using the existing
.htaccess still fail using the above RewriteCond / RewriteRule.
See http://www.guymacon.org/test.html (tests that fail are in *bold*).

You did, however, fix the two test cases that are labeled with
"This is acceptable, but ... would be more desirable." Thanks!

...and, of course, the previous .php suggestion failed for
the same reason all .htaccess solutions fail. It looks to
me like the PHP gets the same bad information from Apache
that the .htaccess gets.
See http://www.guymacon.net/subdirectory//rdtest.php


--
Guy Macon
<http://www.GuyMacon.com/>


Mark Goodge
Guest





Posted: Fri Jul 18, 2008 11:02 am    Post subject: Re: Redirecting double-slash ("//") URLs to single-slash ("/

On Thu, 17 Jul 2008 20:46:28 +0000, Guy Macon
<http://www.GuyMacon.com/> put finger to keyboard and typed:

Quote:



...and, of course, the previous .php suggestion failed for
the same reason all .htaccess solutions fail. It looks to
me like the PHP gets the same bad information from Apache
that the .htaccess gets.
See http://www.guymacon.net/subdirectory//rdtest.php

Put a phpinfo() statement in that page and see what it shows.

Mark


All times are GMT