Velocity Reviews - Computer Hardware Reviews


Java OCR ?

 
 
Soefara
Guest
Posts: n/a
 
      09-18-2003
Is it just me, or does no Java OCR package exist?

I've seen one reference - www.javaocr.com - but if you download the
demo, it's actually a 33KB .jar file which in turn calls a 220KB DLL
(which will run on Windows only). Maybe I'm misunderstanding
something, but this looks a bit misleading.

Even PHP apparently has OCR packages, according to SourceForge. How
can it be that Java does not?

Soefara
 
 
Marco Schmidt
Guest
Posts: n/a
 
      09-18-2003
Soefara:

>Is it just me, or does no Java OCR package exist?


Good OCR is hard and requires a lot of research and experience.
Finereader has an SDK that works under Windows and Linux:
<http://www.abbyy.com/developer_toolkits1.asp?param=28807&from=topcom2>.
Maybe it can be interfaced from Java?

Regards,
Marco
--
Please reply in the newsgroup, not by email!
Java programming tips: http://jiu.sourceforge.net/javatips.html
Other Java pages: http://www.geocities.com/marcoschmidt.geo/java.html
 
 
Soefara
Guest
Posts: n/a
 
      09-20-2003
Thank you for the reply Marco.

> Good OCR is hard and requires a lot of research and experience.
> Finereader has an SDK that works under Windows and Linux:
> <http://www.abbyy.com/developer_toolkits1.asp?param=28807&from=topcom2>.
> Maybe it can be interfaced from Java?


I'm not sure how I'd go about "interfacing" that. However,
there do seem to be quite a few open source and Linux OCR
packages, some of which can be driven from the command line,
the most prominent of which is Clara
(see http://www.claraocr.org/faq.html).

Is there any danger in executing an external program (such
as Clara) from within a Java servlet using something like this?

Runtime.getRuntime().exec("/full/path/to/program [optional-arguments]");


Soefara
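
On the question of danger: the usual risks when launching a program from a servlet are a child process that hangs, output that fills a pipe and blocks the child, and command strings assembled from request data. A minimal sketch of one way to limit those, assuming Java 8 or later; the class name ExternalCommand and its run method are made up for illustration and belong to no library:

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of any OCR package.
final class ExternalCommand {
    private ExternalCommand() {}

    /**
     * Runs the command and returns its combined stdout/stderr.
     * Throws IOException on timeout or on a non-zero exit code.
     */
    static String run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Path out = Files.createTempFile("external-command", ".out");
        try {
            // Arguments are passed as a list, so no shell ever parses them.
            ProcessBuilder pb = new ProcessBuilder(command);
            pb.redirectErrorStream(true);     // merge stderr into stdout
            pb.redirectOutput(out.toFile());  // a file cannot fill up the way a pipe can
            Process p = pb.start();
            p.getOutputStream().close();      // the child gets no input from us
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly().waitFor(); // kill a hung child and wait until it is gone
                throw new IOException("Timed out after " + timeoutSeconds + " s: " + command);
            }
            if (p.exitValue() != 0) {
                throw new IOException("Exit code " + p.exitValue() + ": " + command);
            }
            return new String(Files.readAllBytes(out), Charset.defaultCharset());
        } finally {
            Files.deleteIfExists(out);
        }
    }
}
```

A call would look like ExternalCommand.run(Arrays.asList("/full/path/to/program", "argument"), 30). The program path and its arguments should come from server configuration, never from anything the client sends.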
 
 
Ayesh
Junior Member
Join Date: May 2008
Posts: 1
 
      05-23-2008
Greetings,

I know these posts are quite old, but I found them while looking for an OCR solution in Java, and I would like to share the FREE solution I put together.

I browsed a lot of posts while searching for OCR in Java, and they all pointed to Asprise / javaocr, but those are unaffordable for a non-commercial project.

So I searched for OCR software written in any language, with the aim of interfacing it with Java.

-I discovered GOCR (http://jocr.sourceforge.net/), which is a command-line OCR program. It was a start ^^ I downloaded and used the Windows version. After a few tests I figured out how to use it, but I have to feed it PPM images.

-Here comes the second piece of software, nconvert (http://pagesperso-orange.fr/pierre.g..._nconvert.html), which can convert images to PPM.
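
As a side note, PPM is simple enough that the conversion step could also be done in Java itself. The following is only a sketch of that alternative, not what the code further down does: it writes a BufferedImage as binary PPM (magic number P6, 8 bits per channel), on the assumption that GOCR reads that variant. PpmWriter is a made-up name.

```java
import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: writes an image as binary PPM without an external converter.
final class PpmWriter {
    private PpmWriter() {}

    static void write(BufferedImage img, OutputStream rawOut) throws IOException {
        int w = img.getWidth();
        int h = img.getHeight();
        OutputStream out = new BufferedOutputStream(rawOut);
        // Header: magic number, width and height, maximum channel value.
        out.write(("P6\n" + w + " " + h + "\n255\n").getBytes(StandardCharsets.US_ASCII));
        // Pixel data: rows from top to bottom, each pixel as red, green, blue bytes.
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                out.write((rgb >> 16) & 0xFF);
                out.write((rgb >> 8) & 0xFF);
                out.write(rgb & 0xFF);
            }
        }
        out.flush(); // the caller still owns and closes rawOut
    }
}
```

Used with a FileOutputStream on "text.ppm", this would take the place of the nconvert call, at the cost of having to check yourself that GOCR accepts the file.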

So I wrote 2 static classes that act as an OCR.

The main part is the class OCR, which takes a screenshot of the screen, sets the proper colors (I've only made gocr work with black letters on a white background), writes the image to disk and then calls nconvert and gocr.

By parsing the output stream of the GOCR process you should get your recognized text. There is the "replace" call in the return statement because I work with numbers and gocr makes some mistakes with 1-l and O-0 ^^

It's not a robust OCR facility, but it can help with small applications. I hope it helps, and many thanks to nconvert and gocr.

Code:
package t3x.tnn.utility;

import java.awt.Color;
import java.awt.Point;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;

public class OCR {
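	// Captures the screen area between hg (top-left corner) and bd (bottom-right corner),
	// reduces it to black text on a white background and runs it through nconvert and gocr.
	// color is the text color if isColorEcriture is true, otherwise the background color.
	static public String recognize(Point hg, Point bd, Color color, boolean isColorEcriture){
		String res = null;
		File fImg = new File("screenshot.png");
		// Retries until gocr prints something, so this loops forever if the tools are missing.
		while(res == null){
			BufferedImage img = ScreenHandler.getScreen(hg, bd);
			if(isColorEcriture)
				img = changeWithColorEcriture(img, color);
			else
				img = changeWithColorFond(img, color);
			try {
				ImageIO.write(img, "PNG", fImg);
				// nconvert turns the PNG into the PPM file that gocr needs.
				Process p = Runtime.getRuntime().exec("nconvert -out ppm -o text.ppm screenshot.png");
				p.waitFor();
				p.destroy();
				p = Runtime.getRuntime().exec("gocr045 text.ppm");
				p.waitFor();
				// IOHandler is another class from the same project; it is not included in this post.
				if(p.getInputStream().available()>0)
					res = IOHandler.getResponse(p.getInputStream());
				p.destroy();
			}catch (InterruptedException e) {
				e.printStackTrace();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
		// Remove the temporary files.
		if(fImg.exists())
			fImg.delete();
		File texte = new File("text.ppm");
		if(texte.exists())
			texte.delete();
		// Only digits are expected, so map the letters gocr confuses with them.
		return res.replace("l", "1").replace("O", "0").trim();
	}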

	// Pixels with the text color ("ecriture") become black, everything else white.
	private static BufferedImage changeWithColorEcriture(BufferedImage bi, Color ecriture) {
		if (bi != null) {                       
			int w = bi.getWidth();
			int h = bi.getHeight();
			int pixel;
			BufferedImage bitmp = new BufferedImage(w, h, bi.getType());
			BufferedImage biOut = new BufferedImage(w, h, bi.getType());

			for (int x = 0; x < w; x++) {
				for (int y = 0; y < h; y++) {
					pixel = bi.getRGB(x, y);
					if(pixel != ecriture.getRGB())
						pixel = Color.BLUE.getRGB();
					else
						pixel = Color.BLACK.getRGB();
					bitmp.setRGB(x, y, pixel); 
				}
			}

			for (int x = 0; x < w; x++) {
				for (int y = 0; y < h; y++) {
					pixel = bitmp.getRGB(x, y);
					if(pixel == Color.BLUE.getRGB())
						pixel = Color.WHITE.getRGB();
					biOut.setRGB(x, y, pixel);
				}
			}

			return biOut;
		} else {
			return bi;
		}
	}
	
	// Pixels with the background color ("fond") become white, everything else black.
	private static BufferedImage changeWithColorFond(BufferedImage bi, Color fond) {
		if (bi != null) {                       
			int w = bi.getWidth();
			int h = bi.getHeight();
			int pixel;
			BufferedImage bitmp = new BufferedImage(w, h, bi.getType());
			BufferedImage biOut = new BufferedImage(w, h, bi.getType());

			for (int x = 0; x < w; x++) {
				for (int y = 0; y < h; y++) {
					pixel = bi.getRGB(x, y);
					if(pixel == fond.getRGB())
						pixel = Color.BLUE.getRGB();
					else
						// WHITE here would leave the whole image blank after the second pass.
						pixel = Color.BLACK.getRGB();
					bitmp.setRGB(x, y, pixel); 
				}
			}

			for (int x = 0; x < w; x++) {
				for (int y = 0; y < h; y++) {
					pixel = bitmp.getRGB(x, y);
					if(pixel == Color.BLUE.getRGB())
						pixel = Color.WHITE.getRGB();
					biOut.setRGB(x, y, pixel);
				}
			}

			return biOut;
		} else {
			return bi;
		}
	}
}
Code:
package t3x.tnn.utility;

import java.awt.AWTException;
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.image.BufferedImage;

public class ScreenHandler {

	public static Color getPixelColor(Point p){
		return getPixelColor(p.x, p.y);
	}

	public static BufferedImage getScreen(Point hg, Point bd){
		checkNano();
		return nano.createScreenCapture(new Rectangle(hg, new Dimension(bd.x-hg.x, bd.y-hg.y)));
	}
	
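	public static boolean areImagesEqual(BufferedImage img1, BufferedImage img2){
		// Images of different sizes can never be equal; this also keeps the loop in bounds.
		if(img1.getWidth()!=img2.getWidth() || img1.getHeight()!=img2.getHeight())
			return false;
		int[] timg1 = getPixels(img1);
		int[] timg2 = getPixels(img2);
		for(int i = 0 ; i < timg1.length; i++){
			if(timg1[i]!=timg2[i]){
				return false;
			}
		}
		return true;
	}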
	
	public static Color analyse(Point depart, int deviation, Color fond){
		for(int i= depart.x; i < depart.x+deviation; i++){
			Color col = ScreenHandler.getPixelColor(i, depart.y);
			if(!col.equals(fond))
				return col;
		}
		//IOHandler.abort("[ScreenHandler.analyse] : No game color found");
		return null;
	}
///////////////////////////////////////////////////////////////////////////////////
	private static Robot nano;
	
	private static Color getPixelColor(int x, int y){
		checkNano();
		return nano.getPixelColor(x, y);
	}
	
	private static int[] getPixels(BufferedImage img){
		// Passing null lets the raster allocate an array of exactly the right size.
		return img.getRaster().getPixels(img.getRaster().getMinX(), img.getRaster().getMinY(),  img.getRaster().getWidth(), img.getRaster().getHeight(), (int[]) null);
	}
	
	private static void checkNano(){
		if(nano == null)
			try {
				nano = new Robot();
			} catch (AWTException e) {
				e.printStackTrace();
			}
	}
}
 
 
eduardoavdr
Junior Member
Join Date: Dec 2008
Posts: 1
 
      12-15-2008
Hi Ayesh,

We've developed a web app which indexes documents; you can see it at nootes dot org

What we want now is to build a Swing app which lets me scan documents and run OCR on them so they can be uploaded to my web app through web services (already developed).

I found your solution perfect for my needs, but when I tried to use it in the NetBeans IDE I got an error on the following line:

res = IOHandler.getResponse(p.getInputStream());

Which package do I need in order to use that function?

Thanks for your help and your time.
 

Last edited by eduardoavdr; 12-16-2008 at 05:23 PM..
 
clueless
Junior Member
Join Date: Aug 2009
Posts: 1
 
      08-02-2009
I am aware that this thread is rather old, but I am in need of help! I used Java quite extensively a few years back but unfortunately am a little rusty with it. I am trying to make an OCR program and think the method posted here, using gocr and nconvert, is a good way to avoid Asprise OCR, which has to be paid for...

Anyway, using BlueJ, I am having the same problem as the above poster: "cannot find symbol - variable IOHandler". I tried it in NetBeans too, just to make sure it wasn't a BlueJ quirk, but got the same error message.

From what I can gather, IOHandler hasn't been defined in the OCR class, but I am unsure what type of variable to declare it as so that it can use the getResponse() method. Does anyone have any idea?

I have searched high and low for a solution, but to no avail. I really hope someone can point me in the right direction.

Thanks.

Ewen
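
For anyone hitting the same error: IOHandler is not a variable and not part of any Java package. It is a third class from Ayesh's own project that was never posted, so its real contents are unknown. A hypothetical stand-in that should be enough to make OCR.recognize compile, assuming getResponse simply returns everything the process printed, might look like this:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical replacement for the missing class; the original was never published.
final class IOHandler {
    private IOHandler() {}

    /** Reads the stream to its end and returns the text, with lines joined by '\n'. */
    static String getResponse(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            boolean first = true;
            String line;
            while ((line = reader.readLine()) != null) {
                if (!first) {
                    sb.append('\n');
                }
                sb.append(line);
                first = false;
            }
        }
        return sb.toString();
    }
}
```

To use it with the posted code, save it as IOHandler.java next to OCR.java and add the line package t3x.tnn.utility; at the top so both classes share a package. The commented-out IOHandler.abort call in ScreenHandler suggests the original class had more methods, which this sketch does not try to guess.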
 
 
sherazam
Senior Member
Join Date: Aug 2010
Posts: 183
 
      01-20-2012
I am aware this post is a bit old, but I am sure it will help others searching for this topic. Here is a new stable OCR library from Aspose.

Aspose.OCR is a Java optical character recognition component built to allow developers to add OCR functionality to their Java web applications, web services and Windows applications. Aspose.OCR provides a simple set of classes that allow developers to recognize characters from images. You can extract text in hOCR format, which gives you not only the extracted text but also other text information (font, style, etc.). Aspose.OCR for Java supports Arial, Times New Roman and Tahoma fonts in regular, bold and italic text styles. The API is extensible, easy to use and compact. It provides common functionality so that developers have to write less code when performing common tasks.
 