Improving performance of code

 
 
ruds
Guest
Posts: n/a
 
      04-07-2007
Hi,
I'm reading a file and doing some operations on it. It is a huge file,
running to several GB.
The code works correctly but is very slow. How do I optimise it?
My code snippet is:
class Risk
{
public void compare(String infile) throws IOException
{
cnt=0;
for(i=0;i<qid.size();i++)
{
no=0;
fr=new FileReader(infile);
br=new BufferedReader(fr);
while((str=br.readLine())!=null)
{
no++;
if((str.startsWith("$"))||(str.startsWith("-CONT-")))
continue;
else
{
s2=str.substring(0,10);
if(s2.equals(qid.elementAt(i)))
{
cnt++;
start=no;
end=no+29;
quadarray(infile,start,end);
}

if((cnt==sc) && (i<qid.size()))
{
System.out.println("qid="+qid.elementAt(i));
cnt=0;
writesubcase1();
}
}
}
fr.close();

}

for(i=0;i<tid.size();i++)
{
no=0;
fr=new FileReader(infile);
br=new BufferedReader(fr);
while((str=br.readLine())!=null)
{
no++;
if((str.startsWith("$"))||(str.startsWith("-CONT-")))
continue;
else
{
s2=str.substring(0,10);
if(s2.equals(tid.elementAt(i)))
{
cnt++;
start=no;
end=no+29;
triaarray(infile,start,end);
}
if((cnt==sc) && (i<tid.size()))
{
System.out.println("tid="+tid.elementAt(i));
cnt=0;
writesubcase2();
}
}
}
fr.close();
}
}

public void quadarray(String ifile,int start,int end) throws
IOException
{
try
{
fr1=new FileReader(ifile);
br1=new BufferedReader(fr1);
line=0;
k=0;
x=0;
while((str1=br1.readLine())!=null)
{
line++;
if((line>=start) && (line<end))
{
if(j==0)
quad[j][k]=str1;
if((k==3) ||(k==17)||(k==20))
{
val1=Double.parseDouble(str1.substring(18,36));
if(val1>qmax[x])
{
qmax[x]=val1;
x++;
}
}
if((k==5) ||(k==8)||(k==22)||(k==25))
{
val2=Double.parseDouble(str1.substring(54,72));
if(val2>qmax[x])
{
qmax[x]=val2;
x++;
}
}
if((k==11)||(k==14)||(k==28))
{
val3=Double.parseDouble(str1.substring(36,54));
if(val3>qmax[x])
{
qmax[x]=val3;
x++;
}
}
k++;
}
}
}
catch (Exception e)
{ }
}

public void writesubcase1() throws IOException
{
x=0;
try
{
fw=new FileWriter("Result.txt",true);
for(y=0;y<30;y++)
{
if((y==0)||(y==1)||(y==2)||(y==4)||(y==6)||(y==7)||(y==9)||
(y==10)||(y==12)||(y==13)||(y==15) || (y==16)||(y==18)||(y==19)||
(y==21)||(y==23)||(y==24) || (y==26)||(y==27))
{
fw.write(quad[0][y]+"\n");
continue;
}
else
{
if((y==3)||(y==17)||(y==20))
{
s=quad[0][y];
fw.write(s.substring(0,28)+qmax[x]+s.substring(37)+"\n");
x++;
continue;
}
if((y==5)||(y==8)||(y==22)||(y==25))
{
s=quad[0][y];
fw.write(s.substring(0,64)+qmax[x]+"\n");
x++;
continue;
}
if((y==11)||(y==14))
{
s=quad[0][y];
fw.write(s.substring(0,46)+qmax[x]+s.substring(55)+"\n");
x++;
continue;
}
if(y==28)
{
s=quad[0][y];
fw.write(s.substring(0,46)+qmax[x]+"\n");
x++;
break;
}
}
}
fw.close();
}
catch(Exception e)
{}
}

public void triaarray(String ifile,int start,int end) throws
IOException
{
try
{
fr1=new FileReader(ifile);
br1=new BufferedReader(fr1);
line=0;
while((str1=br1.readLine())!=null)
{
line++;
if((line>=start) && (line<end))
{
if(j==0)
tria[j][k]=str1;
if(k==2)
{
val1=Double.parseDouble(str1.substring(37,54));
if(val1>tmax[0])
tmax[0]=val1;
}
if(k==5)
{
val2=Double.parseDouble(str1.substring(19,36));
if(val2>tmax[1])
tmax[1]=val2;
}
k++;
}
}
}
catch(Exception e)
{}
}

public void writesubcase2()
{
try
{
fw=new FileWriter("Result.txt",true);
for(y=0;y<7;y++)
{
if((y==0)||(y==1)||(y==3)||(y==4))
{
fw.write(tria[0][y]+"\n");
continue;
}
if(y==2)
{
s=tria[0][y];
fw.write(s.substring(0,47)+tmax[0]+s.substring(55)+"\n");
continue;
}
if(y==5)
{
s=tria[0][y];
fw.write(s.substring(0,29)+tmax[1]+"\n");
break;
}
}
fw.close();
}
catch(Exception e)
{}
}

public static void main(String args[])
{
Risk r=new Risk();
ipfile=args[0];

try
{
r.compare(ipfile);
}
catch (Exception e)
{ }
}
}

The code spends a lot of time in the functions quadarray and triaarray.
As you can see, the code in these functions is very simple, but it still
takes a lot of time.

How do I improve it?

 
Esmond Pitt
 
      04-07-2007
ruds wrote:

> How do i improve it??


1. I don't see any need to read the files twice. Read them once each,
and look for both subcases on each line. This will double your speed. If
the output comes out in the wrong order, sort it later. BTW you should
be closing 'br' not 'fr' in this loop.

2. The loops on 'y' in the writesubcaseN() and xxxarray() methods seem
pretty pointless, as you do different things depending on the value of
'y'. Unroll these loops. You could use a lookup table to give you the
various offsets you need, and just loop over the lookup table. Or else
use a switch statement instead of all the tests on 'y'.

3. The triaarray() and quadarray() methods probably spend most of their
time catching up to where you already are in the file. Do you really
need to do this?
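
Points 1 and 3 can be combined into a single pass. The following is a minimal sketch, not the original poster's code: it assumes the wanted ids fit in memory as a set, that a record is the matching line plus the 28 lines after it (the `start=no`, `end=no+29` window), and that no record starts inside another record. The names `SinglePassScan` and `collectBlocks` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class SinglePassScan {
    // Matching line plus the 28 lines after it, mirroring start=no, end=no+29.
    static final int BLOCK_LINES = 29;
    // The id is taken from the first 10 characters, as in str.substring(0,10).
    static final int ID_WIDTH = 10;

    /** Reads the input once and returns every block whose first line starts with a wanted id. */
    static List<List<String>> collectBlocks(Reader in, Set<String> wantedIds) throws IOException {
        List<List<String>> blocks = new ArrayList<>();
        // try-with-resources closes the BufferedReader, which also closes the wrapped reader.
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Skip "$" and "-CONT-" lines, and lines too short to hold an id.
                if (line.startsWith("$") || line.startsWith("-CONT-") || line.length() < ID_WIDTH) {
                    continue;
                }
                // One set lookup replaces one full file scan per id.
                if (!wantedIds.contains(line.substring(0, ID_WIDTH))) {
                    continue;
                }
                List<String> block = new ArrayList<>(BLOCK_LINES);
                block.add(line);
                // Take the rest of the block from the same reader instead of reopening the file.
                String next;
                while (block.size() < BLOCK_LINES && (next = br.readLine()) != null) {
                    block.add(next);
                }
                blocks.add(block);
            }
        }
        return blocks;
    }
}
```

With this shape the file is opened once, and the per-record work (finding the maxima, writing the result) can run on the in-memory block. Whether the quad and tria ids can share one set, or need two sets checked on the same line, depends on details the post does not give.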
 
ruds
 
      04-07-2007
On Apr 7, 10:03 am, Esmond Pitt wrote:
> ruds wrote:
> > How do i improve it??

>
> 1. I don't see any need to read the files twice. Read them once each,
> and look for both subcases on each line. This will double your speed. If
> the output comes out in the wrong order, sort it later. BTW you should
> be closing 'br' not 'fr' in this loop.
>
> 2. The loops on 'y' in the writesubcaseN() and xxxarray() methods seem
> pretty pointless, as you do different things depending on the value of
> 'y'. Unroll these loops. You could use a lookup table to give you the
> various offsets you need, and just loop over the lookup table. Or else
> use a switch statement instead of all the tests on 'y'.
>
> 3. The triarray() and quadarray() methods probably spend most of their
> time catching up to where you already are in the file. Do you really
> need to do this?


I follow suggestions 1 and 2, but for point 3 I don't see any other
way out, at least from my point of view.
If you can suggest something better, you're welcome to.
I'm a newbie at handling files.
Thanks a lot.

 
Mike Schilling
 
      04-07-2007

"ruds" wrote:
>
> How do i improve it??


Indent it and comment it, for a start. In its current state, it's
unreadable.


 
Chris Uppal
 
      04-07-2007
Mike Schilling wrote:

> > How do i improve it??

>
> Indent it and comment it, for a start. In its current state, it's
> unreadable.


The apparent lack of indentation is a bug in the newsreader you (and I) are
using, not a deficiency in the posted source.

-- chris



 
Chris Uppal
 
      04-07-2007
ruds wrote:

> I'm reading a file and doing some operations on it..It is a huge file
> going in GB's.....
> The code is working correctly but is very slow....How do i optimise
> it...


I found your code difficult to follow. You could improve it by using case
statements instead of lots of if-s, by returning from functions as soon as you
know that there is nothing else to do (rather than having the "real" code buried
inside several nested if-s), and above all (as Mike has already mentioned) by
commenting it properly.
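
As a hedged illustration of the case-statement and early-return advice, here is a sketch of how the row tests in quadarray() might collapse into one switch. The column offsets are copied from the posted code; the row numbers 8 and 28 are an inference from the pattern of the other rows and should be checked against the real file. `QuadFields`, `fieldRange` and `valueAt` are hypothetical names.

```java
final class QuadFields {
    /** Returns {begin, end} of the numeric field on the given block row, or null if the row has none. */
    static int[] fieldRange(int row) {
        switch (row) {
            case 3: case 17: case 20:
                return new int[] {18, 36};
            case 5: case 8: case 22: case 25:
                return new int[] {54, 72};
            case 11: case 14: case 28:
                return new int[] {36, 54};
            default:
                return null; // rows that are copied through unchanged
        }
    }

    /** Parses the numeric field of a row, or returns NaN when the row carries no value. */
    static double valueAt(int row, String line) {
        int[] range = fieldRange(row);
        if (range == null || line.length() < range[1]) {
            return Double.NaN; // early return: nothing to do for this row
        }
        return Double.parseDouble(line.substring(range[0], range[1]).trim());
    }
}
```

Keeping the offsets in one place also means the writing code could reuse the same table instead of repeating the row tests.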

So, it's quite possible that I've misread or misunderstood what the code is
doing, but if I /haven't/ got it wrong, then I'm puzzled by what quadarray() is
doing (and the other similar methods). It /looks/ as if it loops over the
entire (huge) input file, keeping count of which line it's looking at (in
variable 'k' -- /not/ a good name, unless there's something special in the
domain which makes 'k' self-explanatory), and only doing anything with certain
numbered lines, 20, 14, 28, and so on. But if that's true, then it doesn't do
anything at all with lines > 28, so there is no point in looping over the
remaining lines in the input file.
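
If that reading is right, the loop can stop as soon as it has passed the last line of interest. A minimal sketch of the idea, assuming 1-based line numbers and the same start-inclusive, end-exclusive window as the posted code; `LineWindow` and `read` are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

final class LineWindow {
    /** Returns lines numbered start (inclusive) to end (exclusive), counting from 1, then stops reading. */
    static List<String> read(Reader in, int start, int end) throws IOException {
        List<String> window = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            int lineNo = 0;
            while ((line = br.readLine()) != null) {
                lineNo++;
                if (lineNo >= end) {
                    break; // past the window: the rest of the file is irrelevant
                }
                if (lineNo >= start) {
                    window.add(line);
                }
            }
        }
        return window;
    }
}
```

This only removes the wasted reading after the window. Each call still rescans from the top of the file up to `start`, so it helps most when the matches are near the beginning.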

If I'm wrong about that (i.e. if you do have to read data from every, or nearly
every, line of the big files), and if Daniel's suggestion about reducing the
number of passes isn't suitable, then I don't think there's very much you can
do to speed it up. If I /had/ to maximise the speed of something like this,
then I'd first try to work out what was the fastest I could possibly scan data
from the files, by writing a small test program which read in all the data as
/binary/ (so there are no conversion costs), and which didn't do anything with
the data. That would give me a baseline so I could tell whether there was any
reasonable speedup available even in theory (there might not be). If that did
turn out to be significantly, /and usefully/, faster than my current code, then
I'd consider (i.e. do a few experiments with) doing most of the processing as
binary. It seems to me that you don't use most of the data on most lines, so
if you can scan the data as binary, and only incur the expense of converting
the data you actually need into text, then you might be able to save some time.
But there again, it might make almost no difference. Only measurement will
tell you (or an analytic, numeric understanding of the performance could
tell you too, but that would require data that I don't have here, and I suspect
you don't have either).
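
A baseline of that kind might look like the sketch below. It is only an illustration: the 64 KiB buffer size is an arbitrary choice, and `RawReadBaseline`, `countBytes` and `timeReadMillis` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class RawReadBaseline {
    /** Reads the whole file as raw bytes, does nothing with them, and returns the byte count. */
    static long countBytes(Path file) throws IOException {
        byte[] buffer = new byte[1 << 16]; // 64 KiB; the size is an arbitrary choice
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n; // no decoding, no line splitting
            }
        }
        return total;
    }

    /** Returns the wall-clock time in milliseconds taken by one raw pass over the file. */
    static long timeReadMillis(Path file) throws IOException {
        long begin = System.nanoTime();
        countBytes(file);
        return (System.nanoTime() - begin) / 1_000_000;
    }
}
```

Calling `timeReadMillis(Paths.get(args[0]))` from a `main` method and comparing the result with the run time of the real program gives the baseline described above. A second run over the same file is often faster because the operating system may have cached it, so it is worth timing more than once.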

BTW, this sounds like one of the examples where profiling is unlikely to be
very helpful (like many examples of using profiling, in my experience).
Profiling is an excellent tool if you have an unexpected hot-spot in your code
which you don't realise is there -- it will point out your error with
devastating clarity. But that situation's not too likely to happen to
competent programmers[*]. The other case where profiling is useful is where
you have a reasonable idea of how long things /should/ take, and you can use
profiling to attach actual numbers to your mental model of the performance.

Oh, another thing that's often worth a try (if you are on Windows or some other
OS which supports transparent compression in the filesystem), is to tell the OS
to compress the data. If your program is primarily IO bound, rather than CPU
bound (which sounds likely in your case -- and it's easy for you to check),
then compressing the data will reduce the amount of data which has to be read
off-disk, albeit at the expense of more processing, which can sometimes be a
useful saving.

-- chris
[*] but it never hurts to check, even so -- if you have time...



 
Lew
 
      04-07-2007
Mike Schilling wrote:
>> Indent it and comment it, for a start. In its current state, it's
>> unreadable.


Chris Uppal wrote:
> The apparent lack of indentation is a bug in the newsreader you (and I) are
> using, not a deficiency in the posted source.


I'm using Thunderbird. I see the original post's indentation, and that it was
done with the TAB character.

No doubt the space character would not have caused such difficulties. Even
though I can see the indentation, the TAB character makes it so wide as to
damage readability.

So either way, OP, using TABs to indent Usenet posts is a Bad Thing.

--
Lew
 
Patricia Shanahan
 
      04-07-2007
Lew wrote:
> Mike Schilling wrote:
>>> Indent it and comment it, for a start. In its current state, it's
>>> unreadable.

>
> Chris Uppal wrote:
>> The apparent lack of indentation is a bug in the newsreader you (and
>> I) are
>> using, not a deficiency in the posted source.

>
> I'm using Thunderbird. I see the original post's indentation, and that
> it was done with the TAB character.
>
> No doubt the space character would not have caused such difficulties.
> Even though I can see the indentation, the TAB character makes it so
> wide as to damage readability.
>
> So either way, OP, using TABs to indent Usenets posts is a Bad Thing.
>


I am not that worried about the indentation, because if I get serious
about looking at a posted program I copy it into Eclipse and click
Source > Format.

I do think the first step in a performance campaign should be making
sure the code is properly commented, as well as having meaningful
identifiers, no arbitrary, unexplained constants etc. The big
improvements usually depend on understanding the code, so that data
structures and algorithms can be changed.
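
As a small, hedged example of what that could mean for the posted code, the literals 10, 29, "$" and "-CONT-" might be given names. The meanings below are guesses from how the code uses them (in particular, that "$" marks a comment line), and every name is hypothetical.

```java
final class RecordLayout {
    // Hypothetical names for literals that appear unexplained in the posted code.
    static final int ID_WIDTH = 10;          // from str.substring(0,10)
    static final int LINES_PER_RECORD = 29;  // from end=no+29
    static final String COMMENT_PREFIX = "$";           // assumed to mark a comment line
    static final String CONTINUATION_PREFIX = "-CONT-"; // assumed to mark a continuation line

    /** True when the line can start a record: not a comment, not a continuation, long enough for an id. */
    static boolean isRecordStart(String line) {
        return !line.startsWith(COMMENT_PREFIX)
                && !line.startsWith(CONTINUATION_PREFIX)
                && line.length() >= ID_WIDTH;
    }

    /** Extracts the id held in the first ID_WIDTH characters of a record's first line. */
    static String recordId(String line) {
        return line.substring(0, ID_WIDTH);
    }
}
```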

Patricia
 
Greg R. Broderick
 
      04-07-2007
"ruds" wrote in news:1175917594.394914.13880@d57g2000hsg.googlegroups.com:

> How do i improve it??
>


1. USE MEANINGFUL VARIABLE NAMES (i.e. more than just a single letter)!


2. Pay attention to horizontal white space -- makes code a LOT easier to
read if there are spaces. Use:

if ((str.startsWith("$")) || (str.startsWith("-CONT-")))

or

if ((str.startsWith("$")) ||
(str.startsWith("-CONT-")))


instead of

if((str.startsWith("$"))||(str.startsWith("-CONT-")))


3. Declare ALL of your variables before you use them. In quadarray() it
appears to me that the variables "j", "quad", "str1", "val1", "qmax", "val2"
are used without having been previously declared.


Just a few suggestions that will prevent your name being cursed by those who
come after you and maintain your code.

Cheers!

--
---------------------------------------------------------------------
Greg R. Broderick

A. Top posters.
Q. What is the most annoying thing on Usenet?
---------------------------------------------------------------------
 
Lars Enderin
 
      04-07-2007
Chris Uppal skrev:
> Mike Schilling wrote:
>
>>> How do i improve it??

>> Indent it and comment it, for a start. In its current state, it's
>> unreadable.

>
> The apparent lack of indentation is a bug in the newsreader you (and I) are
> using, not a deficiency in the posted source.
>

Thunderbird shows all of the tabs, which should have been replaced by
two or maybe three spaces each. It certainly was indented.
 