Newbie: working with a text file and converting to xml

 
 
Adam Teale
      12-06-2006
hi Guys,

I have a tab-delimited text file that I would like to convert into an
xml file that can be read/imported into Apple's Final Cut Pro.

The text file is 2 columns.
The first column is the time (timecode)
The second column is text (for sub-titling)

I thought this might be a good starting project to get into Ruby

Any suggestions on how I might approach this?

Thanks!

Adam Teale

--
Posted via http://www.ruby-forum.com/.

 
Kevin Jackson
      12-06-2006
> I have a tab-delimited text file that I would like to convert into an
> xml file that can be read/imported into Apple's Final Cut Pro.
>
> The text file is 2 columns.
> The first column is the time (timecode)
> The second column is text (for sub-titling)
>
> I thought this might be a good starting project to get into Ruby
>
> Any suggestions on how I might approach this?


Look at Builder (Builder::XmlMarkup) and FasterCSV.

Set up FasterCSV to use a tab as the delimiter instead of a comma, use
it to read the input, and then use Builder to output
<timecode>data</timecode><subtitle>data</subtitle>

should be fairly simple, or you can avoid libraries and do it by
yourself to learn more about ruby without getting bogged down in 3rd
party libs

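require 'builder'  # the Builder gem provides Builder::XmlMarkup

x = Builder::XmlMarkup.new(:target => $stdout, :indent => 1)
x.instruct!        # emits the <?xml ...?> declaration
x.timecode data    # data stands for the field value read from the file
x.subtitle data    # a hyphenated name would need x.tag!('sub-title', data)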

etc
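
For the library-free route mentioned above, a minimal sketch might look like the following. It assumes one "timecode<TAB>subtitle" pair per line and an <item> wrapper element; the helper names escape_xml and row_to_xml are made up for illustration and are not part of any library.

```ruby
# Library-free sketch: one tab-delimited line in, one XML element out.
# escape_xml and row_to_xml are hypothetical helper names.

XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

# Replace the characters that may not appear literally in XML text.
def escape_xml(text)
  text.gsub(/[&<>]/, XML_ESCAPES)
end

# Turn one "timecode<TAB>subtitle" line into an <item> element.
# The limit of 2 keeps any further tabs inside the subtitle.
def row_to_xml(line)
  timecode, subtitle = line.chomp.split("\t", 2)
  "<item><timecode>#{escape_xml(timecode.to_s)}</timecode>" \
    "<subtitle>#{escape_xml(subtitle.to_s)}</subtitle></item>"
end

puts row_to_xml("00:00:42:21\tDurbar Square & surroundings")
```

Escaping matters here because a subtitle containing "&" or "<" would otherwise produce a file that an XML parser rejects.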

Kev

 
Peter Szinek
      12-06-2006
Adam Teale wrote:
> hi Guys,
>
> I have a tab-delimited text file that I would like to convert into an
> xml file that can be read/imported into Apple's Final Cut Pro.
>
> The text file is 2 columns.
> The first column is the time (timecode)
> The second column is text (for sub-titling)


Could you send us 2 example files? I guess the text file format is
obvious (but better to work with a real-life example) but I am not so
sure about the Final Cut Pro XML (or is it just a plain simple XML?)

Until then, check out this code:

============================================================
input = <<INPUT
0.12\tSalut, Foo!
0.15\tHola Bar! Did you see Baz?
0.22\tI guess he is hanging around with Fluff and Ork.
INPUT

template = <<TEMPLATE
<timecode>TIMECODE</timecode>
<sub-titling>SUB-TITLING</sub-titling>
TEMPLATE

# XML needs a single root element; "subtitles" is just a placeholder name
result = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<subtitles>\n"

input.split(/\n/).each do |line|
  # each line is "timecode<TAB>text"
  data = line.split(/\t/)
  result += template.sub('TIMECODE'){data[0]}.sub('SUB-TITLING'){data[1]}
end

result += '</subtitles>'

puts result
============================================================

output:

<?xml version="1.0" encoding="ISO-8859-1"?>
<subtitles>
<timecode>0.12</timecode>
<sub-titling>Salut, Foo!</sub-titling>
<timecode>0.15</timecode>
<sub-titling>Hola Bar! Did you see Baz?</sub-titling>
<timecode>0.22</timecode>
<sub-titling>I guess he is hanging around with Fluff and Ork.</sub-titling>
</subtitles>


Cheers,
Peter

__
http://www.rubyrailways.com


 
Adam Teale
      12-06-2006
Hi Kev & Peter!

Thanks for responding so quickly!

The text file looks pretty much like that

00:00:30:13 Swayambhunath Temple: building started 460AD
00:00:42:21 Durbar Square
00:01:05:06 Driving to Trisuli River for Rafting
00:01:55:22 Day 1 Trekking: Pokhara to Tirkhedhunga (1540m)
00:02:20:20 Day 2 Trekking: Tirkhedhunga to Ghorephani (2750m)
00:02:33:19 Day 3 Trekking: Ghorephani to Ghandruk (1940m)
00:02:42:04 Day 4 Trekking: Ghandruk to Pothana (1900m)
00:03:10:13 Day 5 Trekking: Pothana to Phedi (1130m)

It'll take a while for your example to filter down into my brain - when
it does I'll get back to you about it.

Awesome!

Thank you so much!

Adam





--
Posted via http://www.ruby-forum.com/.

 
Peter Szinek
      12-06-2006
Adam Teale wrote:
> The text file looks pretty much like that


Then it should be fine - as long as there are no tabs in the second
column. Even that would not be an unsolvable problem, but it
would not work with the code I sent you.
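
One way to cope with tabs in the second column is to pass a limit to String#split, so that only the first tab separates the fields. This is a sketch rather than a change to the code posted earlier; split_row is a hypothetical helper name.

```ruby
# split_row is a hypothetical helper: split on the FIRST tab only,
# so any further tabs stay inside the subtitle text.
def split_row(line)
  line.chomp.split("\t", 2)
end

timecode, subtitle = split_row("00:01:05:06\tDriving to Trisuli River\tfor Rafting\n")
puts timecode
puts subtitle.inspect
```

With a plain split("\t") the same line would yield three fields, and the part after the second tab would be lost when only two values are assigned.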

> It'll take a while for your example to filter down into my brain - when
> it does I'll get back to you about it.


Sure!

>
> Awesome!

Yeah, Ruby is awesome! I am a beginner, too (picked up Ruby a few months
ago) and though I have very limited time to learn it, I can do a lot of
things already. The learning curve is really steep.

Cheers,
Peter

__
http://www.rubyrailways.com

 
Adam Teale
      12-06-2006
Hi Peter,

I saved your code and called it convert.rb. I ran it (replacing
'filename' with the path of my text file - was that right to do?)

i got this error:
convert.rb:1: unknown regexp options - atal

any ideas?

Also, do you know if there is any way to run a script from the
command line like this?
./convert.rb mytextfile.txt
I made a shell script that did this kind of thing - it took the input
file as something like $ARGV (I think - sorry, I'm a super newbie!).
Does that make sense?

Thanks Peter!

Adam





--
Posted via http://www.ruby-forum.com/.

 
Peter Szinek
      12-06-2006
Adam Teale wrote:
> Hi Peter,
>
> I saved your code and called it convert.rb. I ran it (replacing
> 'filename' with the path of my text file - was that right to do?)
>
> i got this error:
> convert.rb:1: unknown regexp options - atal
>
> any ideas?

I guess you are referring to Paul's solution, since I did not use any
files. In any case, could you paste the code here (convert.rb) so I
can check what's going on?

> also, do you know if there is any way to run a script from the
> commandline like?:
> ./convert.rb mytextfile.txt


Sure. The array called ARGV contains all the command-line arguments.

------ test.rb
#!/usr/bin/ruby
puts ARGV[0]
puts ARGV[1]
------

./test.rb foo bar    (after making it executable with chmod +x test.rb)

will output

----
foo
bar
----
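
Building on the ARGV example, a converter script could pick its input and output paths like this. This is only a sketch: parse_args is a hypothetical helper, and the "output.xml" default is an assumption, not something the thread prescribes.

```ruby
# parse_args is a hypothetical helper; "output.xml" is an assumed default.
# Returns [input_path, output_path] from a list of command-line arguments.
def parse_args(argv)
  raise ArgumentError, "usage: convert.rb INPUT.txt [OUTPUT.xml]" if argv.empty?
  [argv[0], argv[1] || "output.xml"]
end

# In a real script you would pass ARGV; a literal array is used here
# so the example runs without any command-line arguments.
input, output = parse_args(["mytextfile.txt"])
puts "reading #{input}, writing #{output}"
```

Checking for a missing argument up front gives a readable usage message instead of a confusing error from File.read(nil).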

Cheers,
Peter

__
http://www.rubyrailways.com

 
Adam Teale
      12-06-2006
doh! Sorry guys!

Peter - thanks for the ARGV tips!

I think I have Paul's script going using ARGV:
---------------------------------------------------
#!/usr/bin/ruby -w

data = File.read(ARGV[0])

output = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"

# each_line works on every Ruby version; String#each was removed in 1.9
data.each_line do |line|
  timecode, subtitle = line.strip.split("\t")
  xml = "<item><timecode>#{timecode}</timecode><subtitle>#{subtitle}</subtitle></item>"
  output += xml + "\n"
end

File.open("output.xml","w") { |f| f.write output }
---------------------------------------------------


However it only outputs the first line from my txt file:
---------------------------------------------------
<?xml version="1.0" encoding="ISO-8859-1"?>
<item><timecode>00:00:30:13</timecode><subtitle>Swayambhunath Temple:
building started 460AD
00:00:42:21</subtitle></item>
---------------------------------------------------

Apologies for my newbieness!

Cheers guys!

Adam




--
Posted via http://www.ruby-forum.com/.

 
Peter Szinek
      12-06-2006
Hi,
>
> However it only outputs the first line from my txt file:
> ---------------------------------------------------
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <item><timecode>00:00:30:13</timecode><subtitle>Swayambhunath Temple:
> building started 460AD
> 00:00:42:21</subtitle></item>
> ---------------------------------------------------

Hmm, strange. I cut and pasted this code and the data from your
previous mail, and for me it works perfectly (as do all of Paul's other
solutions). Are you sure your input txt file is OK?

Are you on a Mac? Maybe it is something to do with the line breaks?

> Apologies for my newbieness!

No need to apologize. In no time, *you* will be answering others'
questions.

Peter

__
http://www.rubyrailways.com

 
Adam Teale
      12-06-2006
Ah thanks Peter - yes on OSX - you are right, there is something funny
with the line breaks! Weird!
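
The thread does not say exactly what was wrong with the line breaks, but a common cause on a Mac is a file saved with CR-only (classic Mac) line endings, which Ruby reads as one long line. Assuming that is the case, a sketch of a fix is to normalize the line endings before iterating; normalize_newlines is a hypothetical helper name.

```ruby
# normalize_newlines is a hypothetical helper name.
# Converts Windows (CRLF) and classic Mac (CR) line endings to LF.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# Sample data with CR-only line endings, as an assumed failure case.
data = "00:00:30:13\tSwayambhunath Temple\r00:00:42:21\tDurbar Square\r"

normalize_newlines(data).each_line do |line|
  timecode, subtitle = line.chomp.split("\t", 2)
  puts "#{timecode} => #{subtitle}"
end
```

Without the normalization, each_line sees no "\n" in the sample data and yields the whole file as a single line, which matches the symptom of only one <item> being written.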

now i just have to work out how to add all the FCP xml stuff in there

I appreciate all your help & encouraging words!!






--
Posted via http://www.ruby-forum.com/.

 