How to download the webpages that I want, using HTTrack?

Discussion in 'Computer Information' started by Penang, Jun 25, 2003.

  1. Penang

    Penang Guest

    Penang, Jun 25, 2003
    #1

  2. Penang

    Sid Ismail Guest

    Sid Ismail, Jun 25, 2003
    #2

  3. Penang

    Disco Guest

    It looks like it is only getting files that have relative URIs. The Jan 1 to
    Jan 30 files have URIs like href="/blah/blah/blah.ext", whereas the other
    URIs are absolute, like
    href="http://www.freakmeout.example.com/whatthe.freakinfreakshowfreakhead.ext".
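
    To illustrate the difference between the two kinds of href, here is a
    minimal Python sketch of how a crawler might resolve each form against the
    page it was found on. The page URL is the one from this thread; the href
    values are made-up examples, and this is not a description of how HTTrack
    itself works.

```python
from urllib.parse import urljoin

# Page URL taken from this thread; the href values below are hypothetical.
PAGE_URL = "http://202.186.86.35/english/jan2002.asp"

hrefs = [
    "/english/news/jan30.asp",           # root-relative: keeps the page's host
    "jan29.asp",                         # document-relative: keeps host and directory
    "http://www.example.com/jan28.asp",  # absolute: carries its own host
]

# urljoin resolves each href the way a browser would for a page at PAGE_URL.
resolved = [urljoin(PAGE_URL, href) for href in hrefs]

for href, url in zip(hrefs, resolved):
    print(f"{href!r:40} -> {url}")
```

    The first two results stay on 202.186.86.35, while the absolute link
    points at whatever host is written in the href.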

    Maybe try looking in the application's help for a setting such as "Only
    download files on the web site" or "Do not download external files".

    Also, it may be because you are going to the IP address (202.186.86.35)
    instead of the domain (www.freakenfreakfreaker.example.com).
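
    A rough sketch of why the starting address could matter, assuming the
    mirroring tool decides what is "on the site" by a plain hostname
    comparison. That rule, the helper name is_same_host, and the
    www.example.com URLs are all assumptions for illustration, not HTTrack's
    documented behaviour.

```python
from urllib.parse import urljoin, urlsplit

def is_same_host(start_url, href):
    """Return True if href, resolved against start_url, stays on the same host."""
    target = urljoin(start_url, href)
    return urlsplit(target).hostname == urlsplit(start_url).hostname

START_BY_IP = "http://202.186.86.35/english/jan2002.asp"      # from this thread
START_BY_NAME = "http://www.example.com/english/jan2002.asp"  # hypothetical domain
ABSOLUTE_LINK = "http://www.example.com/english/jan30.asp"    # hypothetical link

# Starting from the IP, an absolute link to the domain looks external.
print(is_same_host(START_BY_IP, ABSOLUTE_LINK))          # False
# Starting from the domain, the same link looks internal.
print(is_same_host(START_BY_NAME, ABSOLUTE_LINK))        # True
# A root-relative link stays on whichever host the crawl started from.
print(is_same_host(START_BY_IP, "/english/jan30.asp"))   # True
```

    Under that assumption, starting the download from the domain name rather
    than the IP address would put the absolute links back in scope.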
     
    Disco, Jun 26, 2003
    #3
  4. Penang

    Penang Guest


    I did. I tried all the different settings. I even turned off the
    "robots.txt" setting.

    But HTTrack still won't do a simple thing like getting the Jan 30 to
    Jan 1 links from the page.

    Instead, it got all kinds of unrelated pages away from the
    http://202.186.86.35/english/jan2002.asp page.

    I just don't know what to do next.

    Does anyone have any suggestions?

    Thanks!
     
    Penang, Jun 26, 2003
    #4