Saving Web Page Issue

Discussion in 'Internet Explorer' started by David, Mar 28, 2010.

  1. David

    David Flightless Bird

    When saving a web page, IE 6.0.2900.5512.xpsp_sp3 (and also Firefox 3.6)
    will create a "_files" directory and store the images, css and js files in
    that directory.

    However, -- more often than not -- the htm or html source code is NOT
    modified to reflect the "_files" directory. It appears to maintain
    the original directory on the server for images, css, and js files.

    Questions:

    1) Is this a function of IE (and/or firefox)?
    2) If so can it be corrected and how?
    3) If not why, can you please provide an explanation as to why the link in
    the html source is not being changed?
    4) Is there any setting I can set in XP (pro in this case) that
    will correct this?
    5) Is there any other solution?

    Thanks
    David
     
  2. Nathan Sokalski

    Nathan Sokalski Flightless Bird

    There are several reasons the browser does not modify the saved file
    (*.htm, *.html, or whatever extension it gets saved with):

    1. The purpose of the _files directory is primarily for cache, so in the
    browser's mind, there is no need to modify the source code.
    2. The save function simply copies the file from the location the browser
    put it when you viewed it to whatever location you are now specifying; no
    parsing is actually done during the save process, and therefore the browser
    does not actually look at the text in the file.
    3. Because many images, css, and js files are dynamically generated (the
    URL in the *.htm, *.html, *.aspx, etc. file doesn't actually end with
    *.gif, *.jpg, *.css, *.js, etc.), looking at the html code might make it
    hard for the browser to figure out what to do.

    If you really want a copy of a page that is completely local to your
    machine, I would suggest that you edit the source code yourself. It's not
    that hard, especially if you have any basic html editor (and not much
    harder in Notepad if you've ever touched html before).
    --
    Nathan Sokalski
    njsokalski@hotmail.com
    http://www.nathansokalski.com/

     
  3. David

    David Flightless Bird

    Mr. Sokalski:

    Thanks for the explanation and your time. Some follow-up questions, if I
    may.

    1)

    If the browser(s) -- IE or Firefox in this case -- is creating the "_files"
    directory as a cache directory (which makes perfect sense), and the html
    source does not include this reference, how does the browser know where to
    find the "cached files" -- so they can be displayed -- when the page is
    downloaded onto the client machine?

    2)
    Why would the source on some webpages contain the "_files" directory
    reference and others not -- unless the html downloaded to the client was
    hard coded this way?

    3)
    Curious as to where you found this info, as I searched everything I could
    think of on the net and came up with zippo, even MSDN. Any link or
    reference that explains -- in detail -- how the browser handles this would
    be appreciated.

    4)
    ------------------
    > If you really want to have a copy of a page that is completely local to
    > your machine, I would suggest that you edit the source code yourself

    -----------------

    This is what I've been doing. Sometimes it's an easy fix, other times not.
    I will write a parsing routine to automate it -- if possible -- but I need
    to understand how the browser is handling this first.

    David
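
As a rough illustration of the kind of parsing routine mentioned above, here is a minimal Python sketch. It is not how IE or Firefox work internally. It assumes the saved assets sit in a single "_files" directory under their original file names, and it only handles quoted src/href attributes. The function name `rewrite_links` and the example directory name `page_files` are hypothetical.

```python
import re
from posixpath import basename
from urllib.parse import urlsplit

# Matches src="..." or href='...' (quoted attribute values only).
ATTR_RE = re.compile(r'\b(src|href)\s*=\s*(["\'])(.*?)\2',
                     re.IGNORECASE | re.DOTALL)


def rewrite_links(html, files_dir, saved_names):
    """Point src/href at files_dir when a file of that name was saved there.

    html        -- text of the saved page
    files_dir   -- name of the "_files" directory, relative to the page
    saved_names -- set of file names actually present in that directory
    """
    def fix(match):
        attr, quote_char, url = match.group(1), match.group(2), match.group(3)
        # Drop any query string or fragment, then keep only the last
        # path segment, e.g. "/css/site.css?v=2" -> "site.css".
        name = basename(urlsplit(url).path)
        if name and name in saved_names:
            url = f"{files_dir}/{name}"
        return f"{attr}={quote_char}{url}{quote_char}"

    return ATTR_RE.sub(fix, html)


sample = ('<img src="http://example.com/images/logo.gif">'
          '<link href="/css/site.css?v=2" rel="stylesheet">'
          '<a href="http://example.com/about.html">About</a>')
result = rewrite_links(sample, "page_files", {"logo.gif", "site.css"})
print(result)
```

Links whose target was not saved (the "About" link here) are left untouched. In practice `saved_names` could be built with `set(os.listdir(...))` on the "_files" directory. A regex is used only to keep the sketch short; a real HTML parser would be more robust, and dynamically generated URLs with no usable file name are not handled.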


     
  4. Donald Anadell

    Donald Anadell Flightless Bird

    "David" <NoWhere@earthlink.net> wrote in message
    news:-OJhqVAtzKHA.4384@TK2MSFTNGP06.phx.gbl...
    > 5) Is there any other solution?


    WinHTTrack

    http://www.httrack.com/page/1/en/index.html

    "It allows you to download a World Wide Web site from the Internet to a
    local directory, building recursively all directories, getting HTML, images,
    and other files from the server to your computer.

    HTTrack arranges the original site's relative link-structure. Simply open a
    page of the "mirrored" website in your browser, and you can browse the site
    from link to link, as if you were viewing it online. HTTrack can also update
    an existing mirrored site, and resume interrupted downloads. HTTrack is
    fully configurable, and has an integrated help system."

    Donald Anadell


     
  5. David

    David Flightless Bird

    Thanks

    Didn't know there was anything in GPL I could look at.

    David

     
  6. Twayne

    Twayne Flightless Bird


    Not certain, but the one time I looked at that, I think it
    turned out to depend on whether the original code used relative
    or absolute ("direct") links. If relative, they'll work on your
    computer. If not relative, then clicking the links or displaying
    images etc. might result in an attempt to retrieve them from the
    original web site, or in nothing at all, depending again on
    coding styles and what you've set your firewall to prevent.
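
To make the relative-versus-absolute distinction concrete, here is a small Python sketch using the standard `urllib.parse` module. The helper names `classify` and `resolve_locally`, and the example path `C:/saved/page.htm`, are assumptions made for illustration and not anything the browsers expose.

```python
from urllib.parse import urljoin, urlsplit


def classify(url):
    """Label a link as 'absolute', 'root-relative', or 'relative'."""
    parts = urlsplit(url)
    if parts.scheme or parts.netloc:
        # Has a scheme (http:) or a host (//cdn.example.com/...),
        # so it always points back at a server.
        return "absolute"
    if parts.path.startswith("/"):
        # Relative to the site root; from a saved copy this resolves
        # against the root of the local drive, not the page's folder.
        return "root-relative"
    return "relative"


def resolve_locally(url, saved_page="file:///C:/saved/page.htm"):
    """Show where a link points when the page is opened from disk."""
    return urljoin(saved_page, url)


for link in ("page_files/logo.gif",
             "/images/logo.gif",
             "http://example.com/images/logo.gif"):
    print(classify(link), "->", resolve_locally(link))
```

Only the plain relative link resolves next to the saved page. The absolute link still points at the original server, which matches the behaviour described above.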

    HTH,

    Twayne
     
