using sockets to open connection to a search engine

 
 
Damo
      01-15-2007
Hi,
I'm trying to open a connection to altavista.com from Java to
retrieve the search results for a query. This is the code I'm using. It
works for Google and Yahoo but not for AltaVista or MSN.

s = new Socket("altavista.com",80);
p = new PrintStream(s.getOutputStream());
p.print("GET /web/results?q=java HTTP/1.0\r\n");
p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; "
        + "rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
p.print("Connection: close\r\n\r\n");
in = s.getInputStream();


If you type www.altavista.com/web/results?q=java into
the address bar, it returns the results page.

Can anyone help me?
Thanks

 
Arne Vajhøj
      01-15-2007
Damo wrote:
> Hi,
> I'm trying to open a connection to altavista.com through java to
> retrieve the search results for a query. This is the code I'm using, it
> works for google and yahoo but not altavista or MSN.
>
> s = new Socket("altavista.com",80);
> p = new PrintStream(s.getOutputStream());
> p.print("GET /web/results?q=java HTTP/1.0\r\n");
> p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
> rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
> p.print("Connection: close\r\n\r\n");
> in = s.getInputStream();
>
> If you type this : www.altavista.com/web/results?q=java into
> the address bar, it will return the result page.


Put something in between your browser and AltaVista and
see what the browser sends.

You already have User-Agent, but maybe it wants Referer or
Accept or Accept-Language or Accept-Encoding.

Or maybe it wants HTTP/1.1 (which requires Host).

There is a limited number of things to add until
you are fully browser compatible.
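
A minimal sketch of the idea above: build the request text in one place so that headers can be added one at a time until the server responds as it does to a browser. The class name `RequestBuilder`, the method `buildGet`, and the `Accept` and `Accept-Language` values are illustrative assumptions, not what any particular browser sends. The sketch always emits `Host`, since HTTP/1.1 requires it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestBuilder {
    // Builds the text of a GET request. HTTP/1.1 requires a Host header,
    // so it is always written; extra headers are appended in insertion order.
    public static String buildGet(String host, String path, Map<String, String> extraHeaders) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        for (Map.Entry<String, String> e : extraHeaders.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append("\r\n");
        }
        // Ask the server to close the connection after the response,
        // then end the header section with an empty line.
        sb.append("Connection: close\r\n\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical header values, chosen only for illustration.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept", "text/html");
        headers.put("Accept-Language", "en-US");
        System.out.print(buildGet("www.altavista.com", "/web/results?q=java", headers));
    }
}
```

The returned string can be written to the socket's output stream as-is. Comparing it with what a proxy shows the browser sending makes it easier to see which header is missing.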

Arne
 
Damo
      01-15-2007

Sorry, I meant to say the error was a 404, resource not found on this
server.
So it's connecting but not returning the results.
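
For reference, the 404 arrives in the first line of the response (the status line), so it can be checked before reading the rest. The following is a rough sketch under that assumption. `StatusLine` and `parseStatusCode` are hypothetical names, and the parser only handles a well-formed, single-space-separated status line.

```java
public class StatusLine {
    // Returns the numeric status code from a status line
    // such as "HTTP/1.1 404 Not Found".
    public static int parseStatusCode(String statusLine) {
        // Split into at most three parts: version, code, reason phrase.
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(parseStatusCode("HTTP/1.1 404 Not Found")); // prints 404
        System.out.println(parseStatusCode("HTTP/1.0 200 OK"));        // prints 200
    }
}
```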

 
Tom Hawtin
      01-15-2007
Damo wrote:
>
> If you type this : www.altavista.com/web/results?q=java into
> the address bar, it will return the result page.


This seems to work (once I managed to spell alta-vista with both Ts -
shouldn't have repeated myself):

import java.io.*;
import java.net.*;

class Search {
    public static void main(String[] args) throws Exception {
        Socket s = new Socket("www.altavista.com", 80);
        // HTTP/1.1 requires a Host header.
        String request =
            "GET /web/results?q=java HTTP/1.1\r\n" +
            "Host: www.altavista.com:80\r\n" +
            "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;rv:1.8.1) " +
            "Gecko/20061010 Firefox/2.\r\n" +
            "Connection: close\r\n\r\n";
        OutputStream out = s.getOutputStream();
        out.write(request.getBytes());
        out.flush();
        InputStream in = s.getInputStream();
        // Echo the raw response (headers and body) until the server closes the connection.
        for (;;) {
            int b = in.read();
            if (b == -1) { break; }
            System.out.print((char) b);
        }
        s.close();
    }
}

Tom Hawtin
 
Damo
      01-15-2007
Excellent, cheers, that did the trick.

 
Martin Gregorie
      01-16-2007
Damo wrote:
> Hi,
> I'm trying to open a connection to altavista.com through java to
> retrieve the search results for a query. This is the code I'm using, it
> works for google and yahoo but not altavista or MSN.
>
> s = new Socket("altavista.com",80);
> p = new PrintStream(s.getOutputStream());
> p.print("GET /web/results?q=java HTTP/1.0\r\n");
> p.print("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
> rv:1.8.1) Gecko/20061010 Firefox/2.\r\n");
> p.print("Connection: close\r\n\r\n");
> in = s.getInputStream();
>
>
> If you type this : www.altavista.com/web/results?q=java into
> the address bar, it will return the result page.
>
> Can anyone help me
> Thanks
>

Try opening the socket to "www.altavista.com".

It's not the same host as "altavista.com". You can see the difference by
pinging them both and looking at the IPs and true host names.
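
The same comparison can be sketched from Java instead of ping. `HostCompare`, `resolve`, and `sameAddresses` are hypothetical names, and the output depends entirely on what DNS returns when the program is run. Note also that two names resolving to the same IP can still be served differently, because a server may choose the site from the Host header.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;

public class HostCompare {
    // Looks up every IP address registered for a host name.
    public static Set<String> resolve(String host) throws UnknownHostException {
        Set<String> result = new TreeSet<>();
        for (InetAddress addr : InetAddress.getAllByName(host)) {
            result.add(addr.getHostAddress());
        }
        return result;
    }

    // True when both lookups produced exactly the same set of addresses.
    public static boolean sameAddresses(Set<String> a, Set<String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) throws UnknownHostException {
        // Defaults are the two names from the thread; pass others as arguments.
        String first = args.length > 0 ? args[0] : "altavista.com";
        String second = args.length > 1 ? args[1] : "www.altavista.com";
        Set<String> a = resolve(first);
        Set<String> b = resolve(second);
        System.out.println(first + " -> " + a);
        System.out.println(second + " -> " + b);
        System.out.println("Same addresses: " + sameAddresses(a, b));
    }
}
```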


--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |
 