
Splitting and keeping key/value

 
 
Sandman
 
      09-26-2006
Indata:
-------------------------------------
Date: 2006-04-03
Message: Wonderful! Let's
meet there! I'll call you
later
Sent by: John
-------------------------------------

I want this parsed into:

Array (
[Date] => "2006-04-03",
[Message] => "Wonderful! Let's\nmeet there! ....."
[Sent by] => "John"
)

By defining keywords that data should be split in, in this case
"Date", "Message", "Sent by" and that those should be the first word
on the line and they should be followed by a ":". The Message part in
my actual indata is at no risk of containing any of these keywords.

Any cute ideas on how to solve that? Thanks in advance.


--
Sandman[.net]
 
Paul Lalli
 
      09-26-2006
Sandman wrote:
> Indata:
> -------------------------------------
> Date: 2006-04-03
> Message: Wonderful! Let's
> meet there! I'll call you
> later
> Sent by: John
> -------------------------------------


Please speak Perl, not some bizarre pseudo-code. Do you mean:
my $Indata = "Date: 2006-04-03
Message: Wonderful! Let's
meet there! I'll call you
later
Sent by: John";

or do you mean:
my @Indata = (
"Date: 2006-04-03\n",
"Message: Wonderful! Let's\n",
"meet there! I'll call you\n",
"later\n",
"Sent by: John\n"
);

?

The difference is important.


> I want this parsed into:
>
> Array (
> [Date] => "2006-04-03",
> [Message] => "Wonderful! Let's\nmeet there! ....."
> [Sent by] => "John"
> )


Is this some sort of pseudo-PHP? Are you aware you posted to a Perl
newsgroup? Do you mean you want:

my %hash = (
'Date' => '2006-04-03',
'Message' => "Wonderful! Let's\nmeet there! ...",
'Sent by' => 'John',
);

?


> By defining keywords that data should be split in, in this case
> "Date", "Message", "Sent by" and that those should be the first word
> on the line and they should be followed by a ":". The Message part in
> my actual indata is at no risk of containing any of these keywords.
>
> Any cute ideas on how to solve that?


I don't know how cute it is, but yes, I could solve that using regular
expressions. Have you made any attempts to solve it yourself yet? If
you post your best attempt, and describe how that attempt is not
working for you, we can probably help you fix it.

Paul Lalli

 
Sandman
 
      09-26-2006
In article <(E-Mail Removed) om>,
"Paul Lalli" <(E-Mail Removed)> wrote:

> Sandman wrote:
> > Indata:
> > -------------------------------------
> > Date: 2006-04-03
> > Message: Wonderful! Let's
> > meet there! I'll call you
> > later
> > Sent by: John
> > -------------------------------------

>
> Please speak Perl, not some bizarre pseudo-code. Do you mean:
> my $Indata = "Date: 2006-04-03
> Message: Wonderful! Let's
> meet there! I'll call you
> later
> Sent by: John";
>
> or do you mean:
> my @Indata = (
> "Date: 2006-04-03\n",
> "Message: Wonderful! Let's\n",
> "meet there! I'll call you\n",
> "later\n",
> "Sent by: John\n"
> );
>
> ?
>
> The difference is important.


My indata is a textfile. Sorry.

> > I want this parsed into:
> >
> > Array (
> > [Date] => "2006-04-03",
> > [Message] => "Wonderful! Let's\nmeet there! ....."
> > [Sent by] => "John"
> > )

>
> Is this some sort of pseudo-PHP?


No.

> Are you aware you posted to a Perl newsgroup?


Yes. Did you or did you not understand the array composition I was
looking for? If you didn't, I would be glad to explain it further so as
to avoid confusion.

> > By defining keywords that data should be split in, in this case
> > "Date", "Message", "Sent by" and that those should be the first word
> > on the line and they should be followed by a ":". The Message part in
> > my actual indata is at no risk of containing any of these keywords.
> >
> > Any cute ideas on how to solve that?

>
> I don't know how cute it is, but yes, I could solve that using regular
> expressions. Have you made any attempts to solve it yourself yet? If
> you post your best attempt, and describe how that attempt is not
> working for you, we can probably help you fix it.


No, I am currently parsing it by:

if ($body=~m/Message: (.*?)\n/){
my $message = $1;
}

But I want a more modular approach.



--
Sandman[.net]
 
Paul Lalli
 
      09-26-2006
Sandman wrote:
> In article <(E-Mail Removed) om>,
> "Paul Lalli" <(E-Mail Removed)> wrote:
>
> > Sandman wrote:
> > > Indata:
> > > -------------------------------------
> > > Date: 2006-04-03
> > > Message: Wonderful! Let's
> > > meet there! I'll call you
> > > later
> > > Sent by: John
> > > -------------------------------------

> >
> > Please speak Perl, not some bizarre pseudo-code. Do you mean:
> > my $Indata = "Date: 2006-04-03
> > Message: Wonderful! Let's
> > meet there! I'll call you
> > later
> > Sent by: John";
> >
> > or do you mean:
> > my @Indata = (
> > "Date: 2006-04-03\n",
> > "Message: Wonderful! Let's\n",
> > "meet there! I'll call you\n",
> > "later\n",
> > "Sent by: John\n"
> > );
> >
> > ?
> >
> > The difference is important.

>
> My indata is a textfile. Sorry.


That completely fails to answer the question. How are you storing this
data *within your program*?


> > > I want this parsed into:
> > >
> > > Array (
> > > [Date] => "2006-04-03",
> > > [Message] => "Wonderful! Let's\nmeet there! ....."
> > > [Sent by] => "John"
> > > )

> >
> > Is this some sort of pseudo-PHP?

>
> No.
>
> > Are you aware you posted to a Perl newsgroup?

>
> Yes. Did you or did you not understand the array composition I was
> looking for?


No, I can only *guess* as to what you meant. My guess may or may not
be correct.

> If you didn't, I would be glad to explain it further so as to avoid confusion.


To avoid confusion, just "speak Perl". That way there is no guessing.
Show us an actual Perl data structure that is the result you are
desiring.

> > > By defining keywords that data should be split in, in this case
> > > "Date", "Message", "Sent by" and that those should be the first word
> > > on the line and they should be followed by a ":". The Message part in
> > > my actual indata is at no risk of containing any of these keywords.
> > >
> > > Any cute ideas on how to solve that?

> >
> > I don't know how cute it is, but yes, I could solve that using regular
> > expressions. Have you made any attempts to solve it yourself yet? If
> > you post your best attempt, and describe how that attempt is not
> > working for you, we can probably help you fix it.

>
> No, I am currently parsing it by:
>
> if ($body=~m/Message: (.*?)\n/){
> my $message = $1;
> }
>
> But I want a more modular approach.


Presumably, you want an approach that works, too, since the above
doesn't. Even assuming you have more in your if() statement, which
adds the message to your structure, that would stop $1 at the first
line of the Message, rather than where the message actually ends.

Consider matching all non-colons up to an internal end-of-line (take a
look at the /m modifier for RegExps)

Code the attempt, and let us know if it doesn't work.

Paul Lalli
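
One way to read the hint above, as a minimal sketch rather than the code anyone in the thread actually posted. It assumes the whole record is already in a scalar called $body, and it anchors on the keyword list from the original question instead of on "all non-colons", so a colon inside the message text does not start a new field. The names @keywords, $key_re and %record are made up for the illustration.

```perl
use strict;
use warnings;

# Assumption: the whole record is already in one scalar.
my $body = join "\n",
    'Date: 2006-04-03',
    "Message: Wonderful! Let's",
    "meet there! I'll call you",
    'later',
    'Sent by: John';

# Hypothetical keyword list; the names come from the sample data.
my @keywords = ('Date', 'Message', 'Sent by');
my $key_re   = join '|', map { quotemeta } @keywords;

my %record;
# /m lets ^ match after every embedded newline; /s lets . cross newlines.
# The lazy (.*?) stops at the next line that starts with a keyword and a
# colon, or at the very end of the string (\z).
while ($body =~ /^($key_re):[ \t]*(.*?)(?=\n(?:$key_re):|\z)/msg) {
    $record{$1} = $2;
}

print "$_ => $record{$_}\n" for @keywords;
```

If the data ends with a newline, the last value keeps it, so a chomp on $body beforehand may be wanted.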

 
Peter J. Holzer
 
      09-26-2006
On 2006-09-26 12:21, Paul Lalli <(E-Mail Removed)> wrote:
> Sandman wrote:
>> In article <(E-Mail Removed) om>,
>> "Paul Lalli" <(E-Mail Removed)> wrote:
>> > Sandman wrote:
>> > > Indata:
>> > > -------------------------------------
>> > > Date: 2006-04-03
>> > > Message: Wonderful! Let's
>> > > meet there! I'll call you
>> > > later
>> > > Sent by: John
>> > > -------------------------------------
>> >
>> > Please speak Perl, not some bizarre pseudo-code. Do you mean:
>> > my $Indata = "Date: 2006-04-03
>> > Message: Wonderful! Let's
>> > meet there! I'll call you
>> > later
>> > Sent by: John";
>> >
>> > or do you mean:
>> > my @Indata = (
>> > "Date: 2006-04-03\n",
>> > "Message: Wonderful! Let's\n",
>> > "meet there! I'll call you\n",
>> > "later\n",
>> > "Sent by: John\n"
>> > );
>> >
>> > ?
>> >
>> > The difference is important.

>>
>> My indata is a textfile. Sorry.

>
> That completely fails to answer the question. How are you storing this
> data *within your program*?


There is no reason why that data should be stored within the program at
all. The file can be read line by line and the array/hash/whatever
datastructure can be constructed on the fly. Slurping the whole file
into memory may make constructing the desired data structure easier
(hard to tell from the vague descriptions Sandman gave us), but it is
certainly not required.

hp

--
   _  | Peter J. Holzer    | > Why should one invent something that
|_|_) | Sysadmin WSR       | > does not exist?
| |   | (E-Mail Removed)   | What else would be the point of inventing?
__/   | http://www.hjp.at/ |    -- P. Einstein and V. Gringmuth in desd
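
The line-by-line approach described in this post could look roughly like the following. This is only a sketch under assumptions: an in-memory filehandle stands in for the real text file, the keyword list is taken from the sample data, and $current and %record are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical keyword list taken from the sample data.
my $key_re = join '|', map { quotemeta } ('Date', 'Message', 'Sent by');

# Assumption: an in-memory filehandle stands in for the real text file.
my $sample = "Date: 2006-04-03\nMessage: Wonderful! Let's\n"
           . "meet there! I'll call you\nlater\nSent by: John\n";
open my $fh, '<', \$sample or die "cannot open sample: $!";

my (%record, $current);
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^($key_re):\s*(.*)$/) {
        # A keyword line starts a new field.
        $current = $1;
        $record{$current} = $2;
    }
    elsif (defined $current) {
        # Any other line continues the field seen most recently.
        $record{$current} .= "\n$line";
    }
}
close $fh;
```

Only one line is held at a time, at the cost of tracking which field is currently open.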
 
Sandman
 
      09-26-2006
In article <(E-Mail Removed). com>,
"Paul Lalli" <(E-Mail Removed)> wrote:

> > My indata is a textfile. Sorry.

>
> That completely fails to answer the question. How are you storing this
> data *within your program*?


If you don't want to help, that's fine. No need to be aggressive. The
way it's stored within the program isn't important. If you assume it's
stored as the content of a variable, work with that. If you don't want
to make any assumptions, don't hit the reply button.

I've been in this group for way too long to be bothered with people
who would rather nitpick on syntax than actually try to help. For
instance, a good response from you would have been something along the
lines of:

Well, if you have the above in, for example, $data, then I would
probably do something like <code>

And my reply then might have been

Thanks, it's not a variable, but read from STDIN, but I can adapt
your solution to my indata, thanks for helping me out!

Thanks for listening.


--
Sandman[.net]
 
Sandman
 
      09-26-2006
In article <3h9Sg.6492$(E-Mail Removed). net>,
"Mumia W. (reading news)" <(E-Mail Removed)>
wrote:

> I would use the substitution operator s/// to repeatedly suck off
> keyword and value segments and place them in a hash. The /e option to
> s/// allows you to execute complicated expressions, and that's what I would
> use here.
>
> Try it yourself.


Yeah, that's pretty much how I've been doing it. I just thought that
there was a more modular approach. I'll try some more. Thanks
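
For readers following along, the s///e suggestion quoted above might be sketched like this. It is one possible reading, not the code either poster used. It assumes the record sits in a scalar $body, which the loop consumes, and the keyword list and the %record name are illustrative.

```perl
use strict;
use warnings;

# Assumption: the whole record is in one scalar, which this loop empties.
my $body = "Date: 2006-04-03\nMessage: Wonderful! Let's\n"
         . "meet there! I'll call you\nlater\nSent by: John\n";

# Hypothetical keyword list taken from the sample data.
my $key_re = join '|', map { quotemeta } ('Date', 'Message', 'Sent by');

my %record;
# Each successful substitution removes one "Keyword: value" segment from the
# front of $body; the /e code stores the pair and returns an empty string.
1 while $body =~ s{\A($key_re):[ \t]*(.*?)(?:\n(?=(?:$key_re):)|\n?\z)}
                  { $record{$1} = $2; '' }se;
```

Anything left in $body afterwards did not start with a known keyword, which gives a cheap check for malformed input.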



--
Sandman[.net]
 
anno4000@radom.zrz.tu-berlin.de
 
      09-26-2006
Sandman <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> In article <3h9Sg.6492$(E-Mail Removed). net>,
> "Mumia W. (reading news)" <(E-Mail Removed)>
> wrote:
>
> > I would use the substitution operator s/// to repeatedly suck off
> > keyword and value segments and place them in a hash. The /e option to
> > s/// allows you to execute complicated expressions, and that's what I would
> > use here.
> >
> > Try it yourself.

>
> Yeah, that's pretty much how I've been doing it. I just thought that
> there was a more modular approach. I'll try some more. Thanks


You've said that twice now. "Modular" means consisting of independent
components. How does that apply here?

Anno
 
Paul Lalli
 
      09-26-2006
Sandman wrote:
> In article <(E-Mail Removed). com>,
> "Paul Lalli" <(E-Mail Removed)> wrote:
>
> > > My indata is a textfile. Sorry.

> >
> > That completely fails to answer the question. How are you storing this
> > data *within your program*?

>
> If you don't want to help, that's fine. No need to be aggressive.


While I was not being aggressive, I rather disagree that there was no
need to be. For some reason, you seem completely unwilling to help
anyone to help you without multiple prodding.

> The
> way it's stored within the program isn't important


Of course it is. If it's stored in a scalar variable, there are
certain operations you can do on it. If it's stored as a list of
lines, there are other operations you can do on it. How is that not
relevant?
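
To make the distinction concrete, here is a small sketch of the two storage forms and one operation that suits each. It assumes an in-memory filehandle in place of the real file; $indata, @indata, $message and @keyword_lines are illustrative names only.

```perl
use strict;
use warnings;

# Assumption: this string stands in for the contents of the text file.
my $sample = "Date: 2006-04-03\nMessage: Wonderful! Let's\n"
           . "meet there! I'll call you\nlater\nSent by: John\n";

# Form 1: the whole file in one scalar ("slurp" mode).
open my $fh, '<', \$sample or die "cannot open sample: $!";
my $indata = do { local $/; <$fh> };
close $fh;

# Form 2: one array element per line.
open $fh, '<', \$sample or die "cannot open sample: $!";
my @indata = <$fh>;
close $fh;

# A scalar allows a single multi-line match across the whole record...
my ($message) = $indata =~ /^Message:[ \t]*(.*?)(?=^Sent by:)/ms;

# ...while an array invites per-line operations such as grep.
my @keyword_lines = grep { /^(?:Date|Message|Sent by):/ } @indata;
```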

>. If you assume it's
> stored as the content of a variable, work with that. If you don't want
> to make any assumptions, don't hit the reply button.


My point is that there is NO REASON to make any assumptions, neither on
my part nor on yours. You clearly are reading the file at some point
in your current script, so why not just tell us *how* you're doing so?!

> I've been in this group for way too long to be bothered with people
> who would rather nitpick on syntax than actually try to help.


I was trying to help. I was trying to help you see how to ask a
question that would be likely to produce a response that would solve
your problem. How is that not helpful?

> For
> instance, a good response from you would have been something along the
> lines of:
>
> Well, if you have the above in, for example, $data, then I would
> probably do something like <code>


No, that would be a REALLY REALLY bad response, because it would
encourage you to continue to post badly formed questions with no
attempt to solve the problem on your own, and would only increase the
number of people who refuse to help you. That would NOT help you in
the long run at all.

Paul Lalli

 
Paul Lalli
 
      09-26-2006
Peter J. Holzer wrote:
> On 2006-09-26 12:21, Paul Lalli <(E-Mail Removed)> wrote:
> > Sandman wrote:
> >> In article <(E-Mail Removed) om>,
> >> "Paul Lalli" <(E-Mail Removed)> wrote:
> >> > Sandman wrote:
> >> > > Indata:
> >> > > -------------------------------------
> >> > > Date: 2006-04-03
> >> > > Message: Wonderful! Let's
> >> > > meet there! I'll call you
> >> > > later
> >> > > Sent by: John


> >> My indata is a textfile. Sorry.

> >
> > That completely fails to answer the question. How are you storing this
> > data *within your program*?

>
> There is no reason why that data should be stored within the program at
> all. The file can be read line by line and the array/hash/whatever
> datastructure can be constructed on the fly. Slurping the whole file
> into memory may make constructing the desired data structure easier
> (hard to tell from the vague descriptions Sandman gave us), but it is
> certainly not required.


I was working on the assumption that the text file really is just 5
lines as the OP showed. In that case, the "penalty" for slurping the
entire file is less than negligible, and the benefits of not having to
parse each line looking for the end of the record, storing the previous
line, joining multiple lines to complete the record, etc, are far more
than worth it.

Paul Lalli
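
With the file slurped as described, the "split and keep the key" idea from the thread title can be sketched in a few lines. This is a hedged illustration, not a solution posted in the thread: $indata is assumed to hold the whole file, and the keyword list and %record are made-up names.

```perl
use strict;
use warnings;

# Assumption: the whole file has already been slurped into $indata.
my $indata = "Date: 2006-04-03\nMessage: Wonderful! Let's\n"
           . "meet there! I'll call you\nlater\nSent by: John\n";

# Hypothetical keyword list taken from the sample data.
my $key_re = join '|', map { quotemeta } ('Date', 'Message', 'Sent by');

# Because the keyword is captured, split returns it along with the text
# that follows it. The first element is whatever preceded the first keyword
# (an empty string here), so it is discarded.
my (undef, %record) = split /^($key_re):[ \t]*/m, $indata;

# Trim the newline that ends each value.
chomp for values %record;
```

The capturing parentheses are what keep the keywords; without them split would return only the values.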

 