HTML Parser - problem with multiple instances

 
 
Matt
04-29-2005
I have a parser program that queries an online shopping-comparison web
page and extracts the information I need. I am trying to run it with
several search terms, which are produced by splitting an entered
sentence into words, so each term is sent as a separate query. However,
the output text files are the same for every word, even though the
correct term and output file appear to be passed in each time. I suspect
the connection is not being closed between runs, but I am not sure why
that would happen.

If I create an identical copy of the program and run that after the
first one, it works, but this is not an appropriate solution.

Any help would be much appreciated. Here is some of my code; if more
is required, I will post it.

To run the program:

StringTokenizer t = new StringTokenizer("red green yellow", " ");
int c = 0;
Parser1 p = new Parser1();
while (t.hasMoreTokens()) {
    c++;
    String tok = t.nextToken();

    // One output file per search term: C:/1.txt, C:/2.txt, ...
    File tem = new File("C:/" + c + ".txt");

    p.mainprog(tok, tem);
}

The parser:

import javax.swing.text.html.parser.*;
import javax.swing.text.html.*;
import javax.swing.text.*;
import java.awt.*;
import java.util.*;
import javax.swing.*;
import java.io.*;
import java.net.*;

public class Parser1 extends HTMLEditorKit.ParserCallback {

    variable declarations

    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        ...methods
    }

    public void handleText(char[] data, int pos) {
        ...methods
    }

    public void handleTitleTag(HTML.Tag t, char[] data) {

    }

    public void handleEmptyTag(HTML.Tag t, char[] data) {

    }

    public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        ...methods
    }

    static void mainprog(String term, File file) {

        ....proxy and authentication methods

        Authenticator.setDefault(new MyAuthenticator());

        HTMLEditorKit editorKit = new HTMLEditorKit();
        HTMLDocument HTMLDoc;
        Reader HTMLReader;

        try {
            String temp = new String(term);
            String fullurl = new String(MainUrl + temp);
            url = new URL(fullurl);
            InputStream myInStream;
            myInStream = url.openConnection().getInputStream();
            HTMLReader = (new InputStreamReader(myInStream));
            HTMLDoc = (HTMLDocument) editorKit.createDefaultDocument();
            HTMLDoc.putProperty("IgnoreCharsetDirective", new Boolean(true));

            ParserDelegator parser = new ParserDelegator();
            HTMLEditorKit.ParserCallback callback = new Parser1();
            parser.parse(HTMLReader, callback, true);

            callback.flush();

            HTMLReader.close();
            myInStream.close();
        }
        catch (IOException IOE) {
            IOE.printStackTrace();
        }
        catch (Exception e) {
            e.printStackTrace();
        }

        try {
            FileWriter writer = new FileWriter(file);
            BufferedWriter bw = new BufferedWriter(writer);
            for (int i = 0; i < vect.size(); i++) {
                bw.write((String) vect.elementAt(i));
                if (vect.elementAt(i) != vect.lastElement()) {
                    bw.newLine();
                }
            }

            bw.flush();
            bw.close();
            writer.close();
        }
        catch (IOException IOE) {
            IOE.printStackTrace();
        }
        catch (Exception e) {
            e.printStackTrace();
        }

        } catch (IOException IOE) {
            System.out.println("User options not found.");
        }

    }
}
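
The post does not show how vect or the other fields are declared, so the following is only a guess at one possible cause, not a diagnosis. Because mainprog is static and writes out vect without receiving it from the callback instance, vect is presumably a static field, and static fields keep their contents from one call to the next. The sketch below uses hypothetical names (SharedStateSketch, shared, runWithSharedState, runWithFreshState) and no networking or parsing; it only shows how results held in a static collection carry over between calls, while a collection created inside the call does not.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedStateSketch {

    // Hypothetical stand-in for a static `vect`: one list shared by every call.
    static final List<String> shared = new ArrayList<>();

    // Appends to the shared list, so earlier terms remain in later outputs.
    static List<String> runWithSharedState(String term) {
        shared.add("result for " + term);
        return new ArrayList<>(shared); // snapshot of what would be written to the file
    }

    // Builds a new list on every call, so each output holds only its own term.
    static List<String> runWithFreshState(String term) {
        List<String> results = new ArrayList<>();
        results.add("result for " + term);
        return results;
    }

    public static void main(String[] args) {
        String[] terms = {"red", "green", "yellow"};
        for (String term : terms) {
            System.out.println("shared: " + runWithSharedState(term));
        }
        for (String term : terms) {
            System.out.println("fresh:  " + runWithFreshState(term));
        }
    }
}
```

If the real program behaves like the shared variant, clearing the static fields at the start of mainprog, or making them instance fields of the callback object and reading them from that object after parsing, would be the things to try first. That would also fit the observation that a fresh copy of the class works, since a copy starts with its own empty static fields.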
 