Converting Word Documents (and other types of files) to PDF in a rails application

 
 
Brendon
      02-13-2007
Hi everyone,

I'm currently developing a Rails content management system, and one of
the useful features we're trying to get sorted is a way to allow a
user to upload a Word document or some other type of regular file, and
have it converted by the server to PDF.

I've seen a Java programme called Alfresco do this using OpenOffice.org,
but I'm just wondering if there's another way that's more integrated
with Ruby?

If anyone knows if bits and pieces of tech already exist in this area
but they just haven't been brought together in this way yet, I'd be
keen to learn the names of these projects. Perhaps we can create a
plugin as a result of our larger project.

Looking forward to your ideas.

Cheers,

Brendon

 
M. Edward (Ed) Borasky
      02-13-2007
There are open source things that kinda sorta almost work, but they're
ephemeral by nature -- Microsoft proprietary formats are ... well ...
proprietary. Abiword and OpenOffice.org can read most Word documents,
and lots of tools can write PDFs. Here's a list of things I have on Gentoo:



* dev-python/rtf2xml
Latest version available: 1.32
Latest version installed: [ Not Installed ]
Unstable version: 1.32
Use Flags (stable): -
Size of downloaded files: 1,255 kB
Homepage: http://rtf2xml.sourceforge.net/
Description: Converts a Microsoft RTF file to structured XML
License: GPL-2

* dev-java/poi
Latest version available: 2.5.1-r1
Latest version installed: [ Not Installed ]
Unstable version: 2.5.1-r1
Use Flags (stable): -doc -elibc_FreeBSD -elibc_FreeBSD +source
Size of downloaded files: 20,128 kB
Homepage: http://jakarta.apache.org/poi/
Description: Java API To Access Microsoft Format Files
License: Apache-2.0

* app-text/catdoc
Latest version available: 0.94.1
Latest version installed: [ Not Installed ]
Unstable version: 0.94.1
Use Flags (stable): +tk
Size of downloaded files: 490 kB
Homepage: http://www.45.free.net/~vitus/software/catdoc/
Description: A convertor for Microsoft Word, Excel and RTF Files to text
License: GPL-2

* app-text/docfrac
Latest version available: 3.1.1
Latest version installed: [ Not Installed ]
Unstable version: 3.1.1
Use Flags (stable): -
Size of downloaded files: 7,105 kB
Homepage: http://docfrac.sourceforge.net/
Description: rtf/html/text conversion utility
License: LGPL-2.1

* app-text/highlight
Latest version available: 2.4.8
Latest version installed: [ Not Installed ]
Unstable version: 2.4.8
Use Flags (stable): -
Size of downloaded files: 925 kB
Homepage: http://www.andre-simon.de/
Description: converts source code to formatted text ((X)HTML, RTF, (La)TeX, XSL-FO, XML) with syntax highlighting.
License: GPL-2

* app-text/unrtf
Latest version available: 0.20.1
Latest version installed: [ Not Installed ]
Unstable version: 0.20.1
Use Flags (stable): -
Size of downloaded files: 448 kB
Homepage: http://www.gnu.org/software/unrtf/unrtf.html
Description: Converts RTF files to various formats
License: GPL-2

* app-text/wv
Latest version available: 1.2.3-r1
Latest version installed: 1.2.3-r1
Unstable version: 1.2.3-r1
Use Flags (stable): +wmf
Size of downloaded files: 1,844 kB
Homepage: http://wvware.sourceforge.net/
Description: Tool for conversion of MSWord doc and rtf files to something readable
License: GPL-2

* dev-python/pyrtf
Latest version available: 0.45
Latest version installed: [ Not Installed ]
Unstable version: 0.45
Use Flags (stable): -
Size of downloaded files: 96 kB
Homepage: http://pyrtf.sourceforge.net
Description: A set of python classes that make it possible to produce RTF documents from python programs.
License: || ( GPL-2 LGPL-2 )

* dev-tex/latex2rtf
Latest version available: 1.9.16
Latest version installed: [ Not Installed ]
Unstable version: 1.9.16
Use Flags (stable): -doc
Size of downloaded files: 1,856 kB
Homepage: http://latex2rtf.sourceforge.net/
Description: LaTeX to RTF converter
License: GPL-2

I think "wv" is the best of this crowd, but it's been a while since I
had to eat a Word document on my Linux box. As far as Word-to-PDF is
concerned, it's more or less built in to OpenOffice.org, so just
shelling out to that might get the job done. OO.o is quite a memory hog,
though, so you'll probably want to fire it up as a server and leave it
running.
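
A minimal sketch of the "shell out to it" idea. It assumes a LibreOffice-style `soffice` binary on the PATH that accepts `--headless --convert-to pdf --outdir`; older OpenOffice.org builds may not have those flags. The helper names are made up for illustration. It starts a fresh process per file and does not cover the keep-it-running-as-a-server setup.

```ruby
require 'open3'

# Builds the argument list for a headless conversion.
# Assumption: `soffice` is a LibreOffice-style binary that understands
# --headless, --convert-to and --outdir.
def pdf_conversion_command(input_path, output_dir, soffice: 'soffice')
  [soffice, '--headless', '--convert-to', 'pdf', '--outdir', output_dir, input_path]
end

# Runs the conversion and returns the path of the PDF it should have written.
# Raises if the process fails or no PDF shows up in output_dir.
def convert_to_pdf(input_path, output_dir)
  _out, err, status = Open3.capture3(*pdf_conversion_command(input_path, output_dir))
  raise "conversion failed: #{err}" unless status.success?

  pdf_path = File.join(output_dir, File.basename(input_path, '.*') + '.pdf')
  raise "no PDF produced for #{input_path}" unless File.exist?(pdf_path)

  pdf_path
end
```

Passing the arguments as an array rather than one interpolated string keeps an uploaded filename from being interpreted by a shell.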









--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.


 
ara.t.howard@noaa.gov
      02-13-2007

i'd recommend one of three approaches

1) run windows under cross-over office on a linux box or under parallels on
osx. then you can install ruby, under windows, and write a ruby application
that does this using the native ms bindings

2) set up a windows server (two would be better) and run a simple drb
service for converting documents - again using the native drivers and ruby

3) use some online service such as http://www.zamzar.com/ and drive it via
http libs/curl
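
A rough sketch of what the drb service in option 2 might look like. The class name, method names, URI and port are all hypothetical. The conversion itself is injected as a callable because the native Word automation is not shown in this thread.

```ruby
require 'drb/drb'

# Hypothetical front object for the DRb service. The real conversion
# (e.g. driving Word on the windows box) is passed in as a callable.
class ConversionService
  def initialize(converter)
    @converter = converter
  end

  # filename: original upload name; data: raw document bytes.
  # Returns whatever the injected converter returns (ideally PDF bytes).
  def convert(filename, data)
    @converter.call(filename, data)
  end
end

# On the windows box: publish the service and block.
# The default URI is a placeholder, not a recommendation.
def serve_conversions(converter, uri = 'druby://0.0.0.0:8787')
  DRb.start_service(uri, ConversionService.new(converter))
  DRb.thread.join
end

# On the rails box: get a proxy whose #convert call runs remotely.
def remote_converter(uri)
  DRbObject.new_with_uri(uri)
end
```

On the rails side this would be used roughly as `remote_converter('druby://winbox:8787').convert('report.doc', File.binread('report.doc'))`, where `winbox` is a placeholder host. DRb does no authentication of its own, so a service like this belongs on a private network only.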

if it were me i'd go with 3 since any reliable method will entail keeping a
windows box online (!!). you could probably write producer code that
posted files to a service and a consumer that checks a special email account
for finished files in a few days...
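
The producer/consumer split could be sketched like this. Everything here is an assumption: `submit` and `fetch` are hypothetical callables standing in for whatever the chosen service actually offers (an http upload, a mailbox check, ...). No real service api is implied.

```ruby
# Tracks files handed to an external conversion service until the
# results come back. Both callables are hypothetical placeholders:
#   submit: path   -> job id
#   fetch:  job id -> PDF bytes, or nil while the job is not finished
class ConversionQueue
  def initialize(submit:, fetch:)
    @submit = submit
    @fetch = fetch
    @pending = {} # job id => original path
  end

  # Producer side: post the file and remember the job.
  def enqueue(path)
    job_id = @submit.call(path)
    @pending[job_id] = path
    job_id
  end

  # Consumer side: collect every job that has finished and leave the
  # rest pending. Returns a hash of original path => PDF bytes.
  def collect_finished
    finished = {}
    @pending.keys.each do |job_id|
      pdf = @fetch.call(job_id)
      next unless pdf

      finished[@pending.delete(job_id)] = pdf
    end
    finished
  end

  def pending_count
    @pending.size
  end
end
```

The consumer would be run periodically (from cron, say) and the pending table would need to live in the database rather than in memory for this to survive a restart.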

regards.

-a
--
we can deny everything, except that we have the possibility of being better.
simply reflect on that.
- the dalai lama

 
 
Brendon
      02-13-2007
Thank you both for your informative replies. I like the idea of an
online service. As I was reading your first two options I quickly came
to your conclusion!

I'll post any learnings and methods on here when I have something
working.

Cheers,

Brendon

 
 
Chris Carter
      02-13-2007

I think the best way might be to use RJB or JRuby and Jakarta
POI (http://jakarta.apache.org/poi/).

--
Chris Carter
concentrationstudios.com
brynmawrcs.com

 