Velocity Reviews - Computer Hardware Reviews

Text parsing and substitution

 
 
maheshpop1@gmail.com
Guest
Posts: n/a
 
      05-19-2006
Hi guys,

I am doing this module where I am gonna change the following sentence

"1:action=commit:user=joe:date=2005-02-02:"
"2:action=checkout:user=mark:date=2005-02-03:"

to something like
" 1. Commits by user Joe on date 2005-02-02 "
" 2. Checkouts by user Joe on date 2005-02-03"

making the above text a little bit more readable to the user. I started
off with a program which finds the different key/value pairs and,
based on the values, appends to or creates a string with appropriate words,
like

pseudocode only

parse the line,
load a hashmap with the key, value pairs
if(hash{action}=='commit') <---this is a mandatory field
    string.="Commits"
    if(defined hash{user})
        string.="by hash{user}"
    if(defined hash{date})
        string.="on date hash{date}"
...................................
...................................
if(hash{action}=='checkout') <---this is a mandatory field
    string.="Checkouts"
    if(defined hash{user})
        string.="by hash{user}"
    if(defined hash{date})
        string.="on date hash{date}"
..............................................
.............................................
I was thinking of this sort of logic, but I am a little apprehensive about
how well it will scale, as I would be handling many actions with separate if
blocks for all of them. Any suggestions or ideas on how to better
achieve what I want to do above.

cheers,
pop.

 
 
 
 
 
Guest
Posts: n/a
 
      05-19-2006
(E-Mail Removed) wrote:
: Hi guys,

: I am doing this module where I am gonna change the following sentence

: "1:action=commit:user=joe:date=2005-02-02:"
: "2:action=checkout:user=mark:date=2005-02-03:"

: to something like

: " 1. Commits by user Joe on date 2005-02-02 "
: " 2. Checkouts by user Joe on date 2005-02-03"


Check whether all your data follow the same pattern and obey the same
constraints. Apparently you're working with fields here, so:

$rawtext = "1:action=commit:user=joe:date=2005-02-02:";
($no, $rawaction, $rawuser, $rawdate) = split(/:/, $rawtext);

# Treat each raw element like this (the key before "=" is discarded):
(undef, $user) = split(/=/, $rawuser);

# Keep a hash for full user names (and for actions as well):

%users = (
    "joe" => "Joe",
    "dan" => "Daniel",
    # ...
);

# Build your phrase in free English, like:

print "On $date, user $users{$user} $actions{$action}...";
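Put together as a runnable script, the lookup-table approach above might look like the sketch below. The `%actions` entries, the `mark` entry in `%users`, the output wording, and the `describe` helper are assumptions made for illustration; none of them come from the post. It also assumes all three fields are present and a perl of 5.10 or later (for the `//` operator).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Display names for users; the "mark" entry is assumed for this sketch.
my %users = (
    joe  => 'Joe',
    dan  => 'Daniel',
    mark => 'Mark',
);

# Past-tense phrases for actions; these are assumed, not from the thread.
my %actions = (
    commit   => 'committed',
    checkout => 'checked out',
);

# describe() is a hypothetical helper: it turns one raw record into a sentence.
sub describe {
    my ($rawtext) = @_;
    my ($no, @pairs) = split /:/, $rawtext;

    # Split each "key=value" element and collect the pairs in a hash.
    my %field = map { my ($k, $v) = split /=/, $_, 2; ($k => $v) } @pairs;

    # Fall back to the raw value when a lookup table has no entry.
    my $user   = $users{ $field{user} }     // $field{user};
    my $action = $actions{ $field{action} } // $field{action};

    return "$no. On $field{date}, user $user $action.";
}

print describe('1:action=commit:user=joe:date=2005-02-02:'), "\n";
```

Because the fields end up in a hash keyed by name, this version does not depend on the order in which `action`, `user`, and `date` appear in the record.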

Hth,

Oliver.


--
Dr. Oliver Corff e-mail: (E-Mail Removed)-berlin.de
 
 
 
 
 
Tad McClellan
Guest
Posts: n/a
 
      05-19-2006
(E-Mail Removed) <(E-Mail Removed)> wrote:

> I am doing this module where I am gonna change the following sentence
>
> "1:action=commit:user=joe:date=2005-02-02:"
> "2:action=checkout:user=mark:date=2005-02-03:"

                          ^^^^
> to something like
> " 1. Commits by user Joe on date 2005-02-02 "
> " 2. Checkouts by user Joe on date 2005-02-03"

                         ^^^

Why did mark's name change to Joe?

Why a trailing space in the 1st one but not in the 2nd one?

Why one space in the 1st one but 2 spaces in the 2nd one?

Are those double quotes actually in your data, or are they
meant to be "meta"?


> pseudocode only



Why?

It takes only a tiny bit of effort to bypass the confusion
caused by the pseudoness.

The value of the answer you can expect to receive is directly
proportional to the effort you put into forming your question...


> if(hash{action}=='commit') <---this is a mandatory field



if( $hash{action} eq 'commit' ) # <--- this is a mandatory field

There, that wasn't very hard now was it?


> Any suggestions or ideas on how to better
> achieve what I want to do above.


----------------------------------
#!/usr/bin/perl
use warnings;
use strict;

while ( <DATA> ) {
    chomp;
    chop;                                # don't need final colon
    my($num, %attrs) = split /[:=]/;
    $attrs{action} .= 's';               # pluralize
    s/(.)/\u$1/ for values %attrs;       # upper case 1st letter
    printf "%2d. %s by user %s on date %s\n",
        $num, @attrs{ qw/action user date/ };
}

__DATA__
1:action=commit:user=joe:date=2005-02-02:
2:action=checkout:user=mark:date=2005-02-03:
----------------------------------
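If `user` and `date` are optional, as the pseudocode in the question suggests, one data-driven way to avoid a separate `if` block per action is to build the sentence from whichever fields are present. This is only a sketch: the `format_record` name, the `@optional` table, the rule that a record without `action` is rejected, and the sample record numbered 3 are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Optional fields in output order, each with the words that introduce it.
my @optional = (
    [ user => 'by user' ],
    [ date => 'on date' ],
);

# format_record() is a hypothetical helper name.
sub format_record {
    my ($line) = @_;
    chomp $line;
    my ($num, %attrs) = split /[:=]/, $line;

    # "action" is the one mandatory field; reject records without it.
    return unless defined $attrs{action};

    my @words = ( ucfirst( $attrs{action} ) . 's' );    # pluralize
    for my $spec (@optional) {
        my ($key, $label) = @$spec;
        next unless defined $attrs{$key};               # skip absent fields
        my $value = $key eq 'user' ? ucfirst( $attrs{$key} ) : $attrs{$key};
        push @words, $label, $value;
    }
    return sprintf '%2d. %s', $num, join ' ', @words;
}

print format_record($_), "\n" for
    '1:action=commit:user=joe:date=2005-02-02:',
    '3:action=checkout:date=2005-02-04:';               # made-up record, no user
```

With this layout a new action needs no new code at all, and a new optional field is one more row in `@optional`.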


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
Dr.Ruud
Guest
Posts: n/a
 
      05-19-2006
(E-Mail Removed) schreef:

> change the following sentence
>
> "1:action=commit:user=joe:date=2005-02-02:"
> "2:action=checkout:user=mark:date=2005-02-03:"
>
> to something like
> " 1. Commits by user Joe on date 2005-02-02 "
> " 2. Checkouts by user Joe on date 2005-02-03"



This assumes that the fields are always in the same order:

#!/usr/bin/perl
use strict;
use warnings;

while ( <DATA> )
{
    s{ ^ ([^:]+)
       : (action) = ([^:]+)
       : (user)   = ([^:]+)
       : (date)   = ([^:]+)
       :
     }
     {$1. \u$3s by $4 \u$5 on $6 $7}x
        and print;
}

__DATA__
1:action=commit:user=joe:date=2005-02-02:
2:action=checkout:user=mark:date=2005-02-03:

--
Affijn, Ruud

"Gewoon is een tijger."


 
 
DJ Stunks
Guest
Posts: n/a
 
      05-19-2006

Tad McClellan wrote:
> #!/usr/bin/perl
> use warnings;
> use strict;
>
> while ( <DATA> ) {
> chomp;
> chop; # don't need final colon


not necessary: split drops empty trailing fields, so the final colon
does not produce an extra element.
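This matches the documented behavior of `split`: with no limit (or a limit of 0), trailing empty fields are dropped, while a negative limit keeps them. A small sketch showing the difference on the first sample record:

```perl
use strict;
use warnings;

my $line = '1:action=commit:user=joe:date=2005-02-02:';

# Default limit: the empty field after the final colon is discarded.
my @default = split /[:=]/, $line;

# A negative limit keeps trailing empty fields.
my @keep_all = split /[:=]/, $line, -1;

printf "default: %d fields, limit -1: %d fields\n",
    scalar @default, scalar @keep_all;
```

The `chomp` is still needed, though: without it the text after the final colon is a newline rather than an empty string, so it would survive as an extra field.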

> my($num, %attrs) = split /[:=]/;


very nice, I always seem to forget that you can initialize a hash with
a list in that way.

> $attrs{action} .= 's'; # pluralize
> s/(.)/\u$1/ for values %attrs; # upper case 1st letter


how about:
ucfirst for values %attrs;

> printf "%2d. %s by user %s on date %s\n",
> $num, @attrs{ qw/action user date/ };
> }
>
> __DATA__
> 1:action=commit:user=joe:date=2005-02-02:
> 2:action=checkout:user=mark:date=2005-02-03:
> ----------------------------------


-jp

 
 
DJ Stunks
Guest
Posts: n/a
 
      05-19-2006

DJ Stunks wrote:
> Tad McClellan wrote:
> > s/(.)/\u$1/ for values %attrs; # upper case 1st letter

>
> how about:
> ucfirst for values %attrs;


um.....?

$_ = ucfirst for values %attrs;

$credibility{jpeavy1}--;

-jp

 
 
Tad McClellan
Guest
Posts: n/a
 
      05-19-2006
DJ Stunks <(E-Mail Removed)> wrote:
> Tad McClellan wrote:



>> s/(.)/\u$1/ for values %attrs; # upper case 1st letter

>
> how about:
> ucfirst for values %attrs;



That is a lot better than what I had...

.... except that it doesn't work.

$_ = ucfirst for values %attrs;
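The shorter form fails because `ucfirst` returns a modified copy and leaves its argument alone, so calling it in void context throws the result away. Assigning back to `$_` works because `values` yields aliases to the hash values. A minimal sketch with a made-up `%attrs`:

```perl
use strict;
use warnings;

my %attrs = ( action => 'commit', user => 'joe' );

# ucfirst returns a new string; the hash value itself is untouched here.
my $copy = ucfirst $attrs{user};
print "copy=$copy, hash still has $attrs{user}\n";

# values() yields aliases, so assigning to $_ updates the hash in place.
$_ = ucfirst for values %attrs;

print join( ', ', map { "$_=$attrs{$_}" } sort keys %attrs ), "\n";
```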


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
Guest
Posts: n/a
 
      05-19-2006
Tad McClellan <(E-Mail Removed)> wrote:

: s/(.)/\u$1/ for values %attrs; # upper case 1st letter

Couldn't this be simplified to:

: s/./\u$&/ for values %attrs; # upper case 1st letter

?

Oliver.
--
Dr. Oliver Corff e-mail: (E-Mail Removed)-berlin.de
 
 
Tad McClellan
Guest
Posts: n/a
 
      05-19-2006
<(E-Mail Removed)-berlin.de> <(E-Mail Removed)-berlin.de> wrote:
> Tad McClellan <(E-Mail Removed)> wrote:
>
>: s/(.)/\u$1/ for values %attrs; # upper case 1st letter
>
> Couldn't this be simplified to:
>
>: s/./\u$&/ for values %attrs; # upper case 1st letter
>
> ?



Yes, but cycles are a terrible thing to waste.

(See $& in perlvar.pod and elsewhere.)
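For context: perlvar explains that on older perls, mentioning `$&` anywhere in a program slows down every regex match in it, which is the waste alluded to here. The same document notes that perl 5.20 largely removed that cost, and that since perl 5.10 the `/p` modifier together with `${^MATCH}` gives the matched text without the program-wide penalty. A sketch of that alternative, assuming perl 5.10 or later and a made-up `%attrs`:

```perl
use strict;
use warnings;
use 5.010;

my %attrs = ( action => 'commit', user => 'joe' );

# /p preserves the matched text in ${^MATCH} for this match only.
s/./\u${^MATCH}/p for values %attrs;

say "$attrs{action} $attrs{user}";
```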


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
Dr.Ruud
Guest
Posts: n/a
 
      05-19-2006
(E-Mail Removed)-berlin.de schreef:
> Tad McClellan:


>> s/(.)/\u$1/ for values %attrs; # upper case 1st letter

>
> Couldn't this be simplified to:
>
> s/./\u$&/ for values %attrs; # upper case 1st letter


It is not simpler. It might be a tad slower.

Alternatives:

$_ = "\u$_" for values %attrs ;

$_ = ucfirst for values %attrs ;

--
Affijn, Ruud

"Gewoon is een tijger."


 
 
 
 