Parsing text

 
 
Cyril Jose
 
      04-22-2011
Hey all,

I have a file that I need to parse information from. The format of the
first line is something like this:

">ruby ruby |ruby|ruby ruby|text_i_want| test test"

I was thinking of converting this line into an array using .split(//)
and keeping count of the pipe ("|") characters, so that when it reaches
the 3rd one, it reads the characters up to the 4th pipe (all in a do
iterator). In essence, I want to extract "text_i_want". When I
tried this method, I got stuck. Any ideas on how to move forward? Or is
there an easier solution than this? Thanks!

--
Posted via http://www.ruby-forum.com/.

 
John W Higgins
 
      04-22-2011
[Note: parts of this message were removed to make it a legal post.]

Good Afternoon,

On Thu, Apr 21, 2011 at 6:43 PM, Cyril Jose wrote:

> Hey all,
>
> I have a file where I need to parse information from. The format of the
> first line is something like this:
>
> ">ruby ruby |ruby|ruby ruby|text_i_want| test test"
>
> I was thinking converting this line into an array, using the .split(//)
>


You got close - this should work for you

split(/\|/)[3]

That will return the 4th group of text for you
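
For reference, here is a minimal sketch of that one-liner applied to the sample line from the question. The variable names (line, fields, wanted) are only illustrative, and it assumes the line always contains at least three pipes; with fewer fields, indexing with [3] would return nil.

```ruby
# Sample line taken from the original question.
line = ">ruby ruby |ruby|ruby ruby|text_i_want| test test"

# Split on literal pipes; the backslash stops | from meaning "OR".
fields = line.split(/\|/)

# Arrays are zero-indexed, so index 3 is the fourth field.
wanted = fields[3]
puts wanted
```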

John

 
7stud --
 
      04-22-2011
A pipe is one of the special regex characters--it does not stand for a
literal pipe. A pipe is used in a regex to mean 'OR'.
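
To make the difference concrete, here is a small sketch comparing an unescaped and an escaped pipe. The string "a|b" and the variable names are made up for illustration. As far as I understand it, the bare pattern /|/ reads as "empty OR empty", so it matches the empty string between every character.

```ruby
sample = "a|b"

# Unescaped: | is alternation between two empty patterns, so the
# match is empty and split breaks the string into single characters.
unescaped = sample.split(/|/)

# Escaped: \| matches a literal pipe character.
escaped = sample.split(/\|/)

# Ordinary use of | as 'OR': matches either "cat" or "dog".
matches_either = "dog".match?(/cat|dog/)

p unescaped
p escaped
p matches_either
```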

There are several ways to escape the special regex characters so that
they lose their special meaning and match themselves:

1) You can use a backslash to escape the pipe.

2) You can put the pipe in a character class:

str = ">ruby ruby |ruby|ruby ruby|text_i_want| test test"

pieces = str.split(/[|]/)
puts pieces[3]

--output:--
text_i_want

3) You can call Regexp.escape to escape any special regex characters
contained in the string, so that they lose their special meaning:

str = ">ruby ruby |ruby|ruby ruby|text_i_want| test test"

pattern = "|"
esc_str = Regexp.escape(pattern)

pieces = str.split(/#{esc_str}/)
puts pieces[3]

--output:--
text_i_want
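
Option 1 has no example above, so here is a rough side-by-side sketch of all three approaches on the same string. The variable names are just for illustration. The last line is an addition that is not one of the three options: String#split also accepts a plain String, which is treated as a literal delimiter rather than a regex, so no escaping is needed in that case.

```ruby
str = ">ruby ruby |ruby|ruby ruby|text_i_want| test test"

# 1) Backslash escape.
with_backslash = str.split(/\|/)[3]

# 2) Character class.
with_char_class = str.split(/[|]/)[3]

# 3) Regexp.escape, building the regex from an escaped string.
with_escape = str.split(Regexp.new(Regexp.escape("|")))[3]

# Not in the list above: a String delimiter is matched literally.
with_string = str.split("|")[3]

puts with_backslash, with_char_class, with_escape, with_string
```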


 
Cyril Jose
 
      04-22-2011
Thanks John and 7stud - I have a better understanding now.


 