UrlParser


Relative and Absolute Urls

Creation
  1. Relative Urls
  2. Substitute the ..
Example
  1. /index.html
  2. ../../htmlparser/../../../index.html

Relative Urls
Base Url: http://demo.borland.com/htmlparser/UrlParser/UrlParser.asp

Link
  1. /index.html
  2. UrlParser.asp
Link text
  1. Index.html
  2. UrlParser.asp

Requests to be sent to the server

  1. GET /index.html
  2. GET /htmlparser/UrlParser/UrlParser.asp
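
The two requests above can be reproduced with urljoin from Python's standard urllib.parse module. This is only an illustrative sketch: it assumes the base url's directory is /htmlparser/UrlParser/, as the requests above imply, and the helper name request_line is made up for this example. It says nothing about how either browser implements the step.

```python
from urllib.parse import urljoin, urlsplit

# Assumed base url; its directory matches the requests listed above.
BASE = "http://demo.borland.com/htmlparser/UrlParser/UrlParser.asp"


def request_line(href):
    """Return the GET line for href resolved against BASE (hypothetical helper)."""
    return "GET " + urlsplit(urljoin(BASE, href)).path


print(request_line("/index.html"))    # GET /index.html
print(request_line("UrlParser.asp"))  # GET /htmlparser/UrlParser/UrlParser.asp
```

A link starting with '/' replaces the whole path of the base url, while a bare file name only replaces the last path segment.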

Substitute the ..

Problems that may occur: a sequence of '../' may address a nonexistent node above the root node '/'. Both browsers, Netscape and IExplorer, handle this situation correctly.
IExplorer treats each '\' like a '/'. Netscape treats '\' as an ordinary character.

Link
  1. ../../../../../../../../../../../htmlparser/index.html
  2. /htmlparser/../../../../../../../../../../index.html
  3. ../..\../..\../..\../index.html
  4. ../..\/\/..\..//index.html
  5. /htmlparser/UrlParser/../\//..\/\/../index.html
  6. /htmlparser/lexer/applets/../..\/index.html
Link text
  1. Index.html
  2. index.html
  3. index.html
  4. index.html
  5. index.html
  6. index.html

Requests to be sent to the server
Base Url: 'http://demo.borland.com/htmlparser/UrlParser/'

IExplorer
  1. GET /htmlparser/index.html
  2. GET /index.html
  3. GET /index.html
  4. GET ///index.html
  5. GET /htmlparser///htmlparser///index.html
  6. GET /htmlparser//index.html
Netscape
  1. GET /htmlparser/index.html
  2. GET /index.html
  3. GET /htmlparser/..\../..\../..\../index.html
  4. GET /htmlparser/..\/\/..\..//index.html
  5. GET /htmlparser/\/Caching\/index.html
  6. GET /htmlparser/lexer/..\/index.html
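
The substitution of '..' can be sketched as a small stack walk over the path segments. The function below is a hypothetical illustration, not the browsers' code: it ignores any '..' that would climb above the root, and a flag chooses between treating '\' like '/' (the IExplorer behaviour described above) and treating it as an ordinary character (the Netscape behaviour). It also collapses empty segments, so it only reproduces the results for links 1 to 3, not the repeated slashes seen in links 4 to 6.

```python
def resolve_path(base_path, ref, backslash_as_slash):
    """Resolve ref against base_path and return the request path (sketch)."""
    if backslash_as_slash:
        ref = ref.replace("\\", "/")
    if ref.startswith("/"):
        joined = ref
    else:
        # Keep the directory part of the base path and append the reference.
        joined = base_path[: base_path.rfind("/") + 1] + ref
    stack = []
    for segment in joined.split("/"):
        if segment == "..":
            if stack:  # a '..' above the root node is silently dropped
                stack.pop()
        elif segment not in ("", "."):
            stack.append(segment)
    return "/" + "/".join(stack)


BASE_PATH = "/htmlparser/UrlParser/"
link1 = "../../../../../../../../../../../htmlparser/index.html"
link3 = r"../..\../..\../..\../index.html"

print(resolve_path(BASE_PATH, link1, True))   # /htmlparser/index.html
print(resolve_path(BASE_PATH, link3, True))   # /index.html
print(resolve_path(BASE_PATH, link3, False))  # /htmlparser/..\../..\../..\../index.html
```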

Interpretation of results:


Nonconforming Urls

Creation
  1. Relative urls with backslashes
  2. Absolute urls with backslashes
Example
  1. ..\tmp\temp02.dhtml
  2. http:\\demo.borland.com\Embeds.html

Relative urls with backslashes:
IExplorer treats '\' like '/'. Netscape detects a nonconforming url, takes the document's base url, and appends the url specified in the tag to it.

  1. Embedded image:
    <img src="..\..\images\bg_logo_segue.gif">
  2. Hyperlink:
    <a href=..\..\index.html>index.html</a>
  3. Form:
    <form action="..\..\data2html.asp" method=get>
        <input type=text name=Unleashed>
        <input type=submit>
    </form>
  1. alternative

  2. index.html



The requests sent to the server should look like:

IExplorer
  1. GET /bg_logo_segue.gif
  2. GET /index.html
  3. GET /data2html.asp?Unleashed=SAMS
Netscape
  1. GET /htmlparser/UrlParser/..\..\bg_logo_segue.gif
  2. GET /htmlparser/UrlParser/..\..\index.html
  3. GET /htmlparser/UrlParser/..\..\data2html.asp?Unleashed=SAMS
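
The difference between the two columns can be illustrated with a short sketch. It assumes the base url http://demo.borland.com/htmlparser/UrlParser/ used earlier on this page. The function names are hypothetical, and the code only mimics the observed requests; it is not taken from either browser.

```python
from urllib.parse import urljoin, urlsplit

# Assumed base url of the test document.
BASE = "http://demo.borland.com/htmlparser/UrlParser/"


def ie_like_path(ref):
    """Treat '\\' like '/', then resolve the reference against BASE."""
    return urlsplit(urljoin(BASE, ref.replace("\\", "/"))).path


def netscape_like_path(ref):
    """Append the unmodified reference to the directory of BASE."""
    base_dir = urlsplit(BASE).path
    return base_dir[: base_dir.rfind("/") + 1] + ref


href = r"..\..\index.html"
print(ie_like_path(href))        # /index.html
print(netscape_like_path(href))  # /htmlparser/UrlParser/..\..\index.html
```

Because the Netscape-like variant never sees a '/' inside the reference, the '..' parts are not recognised as path segments and reach the server unchanged.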


Absolute urls with backslashes
IExplorer can handle urls where '/' is replaced by '\'. Netscape starts processing the url by consuming 'http:'. It then detects a nonconforming path and reacts as follows: it takes the base url and appends the rest of the url specified in the html tag. That rest of the url still has to be parsed as well, as demonstrated in link 3.
When backslashes occur later within an otherwise conforming url, as in link 2, Netscape simply consumes them as normal characters. No special url handling is necessary.

Link
  1. http:\\demo.borland.com\htmlparser\index3.html
  2. http://demo.borland.com/htmlparser\index3.html
  3. http:\\demo.borland.com/htmlparser/../index.html
Link text
  1. Index3.html
  2. Index3.html
  3. index.html

The requests sent to the server should look like:

IExplorer
  1. GET /htmlparser/index3.html
  2. GET /htmlparser/index3.html
  3. GET /index.html
Netscape
  1. GET /htmlparser/UrlParser/\\demo.borland.com\htmlparser\index3.html
  2. GET /htmlparser\index3.html
  3. GET /htmlparser/UrlParser/\\demo.borland.com/index.html
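
The Netscape behaviour described above can be sketched in three steps: consume 'http:', fall back to the base path when the rest does not start with '//', then resolve '..' while treating '\' as an ordinary character. The code is a hypothetical reconstruction from the observed requests, assuming the base path /htmlparser/UrlParser/. The IExplorer variant uses urlsplit and posixpath.normpath from the Python standard library merely as a convenient stand-in.

```python
import posixpath
from urllib.parse import urlsplit

BASE_PATH = "/htmlparser/UrlParser/"  # assumed directory of the test document


def ie_like_path(href):
    """Treat '\\' like '/', then take the normalised path of the url."""
    return posixpath.normpath(urlsplit(href.replace("\\", "/")).path)


def netscape_like_path(href):
    """Mimic the observed Netscape requests for 'http:' urls (sketch)."""
    rest = href[len("http:"):] if href.startswith("http:") else href
    if rest.startswith("//"):
        # Conforming authority: the path starts at the first '/' after the host.
        after_host = rest[2:]
        slash = after_host.find("/")
        path = after_host[slash:] if slash != -1 else "/"
    else:
        # Nonconforming: append the rest of the url to the base path.
        path = BASE_PATH + rest
    segments = []
    for segment in path.split("/")[1:]:
        if segment == "..":
            if segments:
                segments.pop()
        elif segment != ".":
            segments.append(segment)  # '\' stays an ordinary character
    return "/" + "/".join(segments)


link3 = r"http:\\demo.borland.com/htmlparser/../index.html"
print(ie_like_path(link3))        # /index.html
print(netscape_like_path(link3))  # /htmlparser/UrlParser/\\demo.borland.com/index.html
```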

<a href="http://ftpuser:ftppass@/></a>
Absolute link to

updated: marko /24/10/00