Velocity Reviews - Computer Hardware Reviews

How to convert MS doc to plain text using Perl on unix

 
 
Diandian Zhang
      01-14-2005
Does anyone have an idea, how to do this? Thanks!
 
Arndt Jonasson
      01-14-2005

Diandian Zhang writes:
> Does anyone have an idea, how to do this? Thanks!


There are some modules on CPAN dealing with RTF (rich text format).
Maybe they are useful.
 
Tintin
      01-14-2005

Diandian Zhang wrote:
> Does anyone have an idea, how to do this? Thanks!


Open it in OpenOffice and save it as a text file.


 
Stephen Patterson
      01-15-2005
Diandian Zhang wrote:
> Does anyone have an idea, how to do this? Thanks!


If you're on Windows and have Word, use Win32::OLE.
 
A. Sinan Unur
      01-15-2005
Stephen Patterson wrote in news:34t8d9F4ebb01U1@individual.net:

> Diandian Zhang wrote:
>> Does anyone have an idea, how to do this? Thanks!
>
> If you're on windows and have word, Win32::OLE


Once again, the perils of putting your entire question in the subject
line are demonstrated.

The OP needs this on Unix.

One alternative is to take a look at word2x (google for it).
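
If you go the external-converter route (word2x or something like it), a minimal sketch for calling it from Perl might look like this. The helper name convert_with is made up for illustration, and the converter's command line is passed through untouched because its exact arguments are not given in this thread; check the tool's own documentation for those.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run an external converter and return whatever it
# prints on STDOUT. @cmd is the full command line as a list, so the shell
# is not involved and file names need no extra quoting.
sub convert_with {
    my @cmd = @_;

    # List-form pipe open: fork, exec @cmd, read its STDOUT.
    open(my $pipe, '-|', @cmd)
        or die "Cannot run $cmd[0]: $!\n";
    binmode $pipe;

    local $/;                 # slurp mode: read everything at once
    my $text = <$pipe>;

    # close() returns false if the command exited with a non-zero status.
    close($pipe)
        or die "$cmd[0] failed (wait status $?)\n";

    return defined $text ? $text : '';
}

# Everything on the command line is taken as the converter command,
# including the name of the Word file. With no arguments, nothing runs.
print convert_with(@ARGV) if @ARGV;
```

Saved under a hypothetical name such as doc2txt.pl, it would be run as "perl doc2txt.pl <converter> <its arguments> file.doc", and the converted text ends up on STDOUT.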

On the other hand, if all one wants to do is, say, index the contents of a
Word file, the following would work to a certain extent:

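#! /usr/bin/perl

use strict;
use warnings;

use File::Slurp;

# Path of the Word file, taken from the command line.
my $word_file = shift;
die "Usage: $0 file.doc\n" unless defined $word_file;

# Read the whole file as raw bytes.
my $doc = read_file($word_file, binmode => ':raw');

# Keep only CR, LF, TAB and printable ASCII; drop every other byte.
$doc =~ s/[^\015\012\011\040-\176]//g;
write_file(\*STDOUT, $doc);

__END__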
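
To see what that substitution does, here is a small sketch that applies the same character class to a made-up byte string instead of a real Word file. The sub name strip_nonprintable and the sample data are assumptions for illustration only; the sample imitates text stored as two bytes per character with some binary bytes around it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same character class as the script above: keep CR (\015), LF (\012),
# TAB (\011) and printable ASCII (\040-\176); drop every other byte.
sub strip_nonprintable {
    my ($bytes) = @_;
    $bytes =~ s/[^\015\012\011\040-\176]//g;
    return $bytes;
}

# Made-up sample, not a real .doc file: a few binary bytes, then "Hi!"
# with a NUL after each character, then two control bytes and a newline.
my $sample = "\xD0\xCF\x11\xE0" . "H\0i\0!\0" . "\x01\x02\n";

print strip_nonprintable($sample);   # prints "Hi!" followed by a newline
```

Binary bytes that happen to fall in the printable ASCII range survive the filter as well, which is why the result on a real file is only good "to a certain extent": fine for indexing, noisy for reading.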

Sinan
 