Re: Is there a better way than this?

 
 
Helmut Richter
 
      06-04-2013
On Tue, 4 Jun 2013, Dave Stratford wrote:

> My requirements are to extract the initial letter(s) from the outcode. So
> if a user entered HP13, I want to extract just the HP, equally if they
> entered WC1A, I want just the WC.
>
> My current code, which works perfectly fine, looks like this:
>
> my $oc = substr($outcode,0,2);
> my $ocr = substr($oc,1,1);
>
> $oc = substr($outcode,0,1) if ($ocr =~ /\d/);


What about

$outcode =~ /^([A-Z]+)\d/;
$oc = $1;

--
Helmut Richter
 
Rainer Weikusat
 
      06-04-2013
Helmut Richter <(E-Mail Removed)> writes:
> On Tue, 4 Jun 2013, Dave Stratford wrote:
>
>> My requirements are to extract the initial letter(s) from the outcode. So
>> if a user entered HP13, I want to extract just the HP, equally if they
>> entered WC1A, I want just the WC.
>>
>> My current code, which works perfectly fine, looks like this:
>>
>> my $oc = substr($outcode,0,2);
>> my $ocr = substr($oc,1,1);
>>
>> $oc = substr($outcode,0,1) if ($ocr =~ /\d/);

>
> What about
>
> $outcode =~ /^([A-Z]+)\d/;
> $oc = $1;


If the regex doesn't match, $oc will now contain the value captured by
the last successful regex match before this line. This may not be a
problem in this case, but in general it is prudent to use $n only when
the corresponding match was successful. Assuming that $oc is guaranteed
to be undefined to begin with, I would use something like

$outcode =~ /^([A-Z]{1,2})\d/ and $oc = $1; [*]

$oc will now have the correct value if $outcode really started with one
or two letters followed by a digit. Depending on where the input came
from, [0-9] instead of \d might be a better choice: UK postcodes
certainly use Arabic numerals, but \d might match anything the Unicode
Consortium considers a digit now or in the future.

[*] The obvious other way to write this is

($oc) = $outcode =~ /^([A-Z]{1,2})\d/;
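
To make the stale-$1 pitfall concrete, here is a minimal sketch (not from the original posts). The earlier match on "stale value", the sample input "1234" and the variable names are all made up for illustration, and [0-9] is used in place of \d as suggested above:

```perl
use strict;
use warnings;

# Hypothetical earlier match in the same scope; it sets $1 to "stale".
"stale value" =~ /^(\w+)/ or die "setup match failed";

my $outcode = "1234";    # invalid on purpose: no leading letters

# Unchecked form: the match fails, so $1 still holds the earlier capture.
$outcode =~ /^([A-Z]{1,2})[0-9]/;
my $unchecked = $1;

# Checked form: list assignment yields undef when the match fails.
my ($checked) = $outcode =~ /^([A-Z]{1,2})[0-9]/;

print "unchecked: $unchecked\n";
print "checked:   ", defined $checked ? $checked : "(undef)", "\n";
```

The unchecked variable ends up holding a capture that has nothing to do with $outcode, while the list-assignment form leaves its variable undefined.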
 
Ted Zlatanov
 
      06-05-2013
On Tue, 4 Jun 2013 20:15:33 +0100 Ben Morrow <(E-Mail Removed)> wrote:

BM> If you have 5.14 you can squash this particular bit of stupidity with
BM> the /a switch. /\d/a, /\w/a and /\s/a all match what they used to in
BM> 5.6. (Except that I believe in 5.18 \s also matches vertical tab,
BM> \x0B, both with and without /a.)

BM> You can also turn on /a for all patterns in a lexical scope with

BM> use re "/a";

Neat, I didn't notice that one.

Is there a way to test for this feature, or do I just have to require 5.14?

Ted
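
A small sketch of what /a changes for \d, assuming perl 5.14 or later. The test character (U+0663, ARABIC-INDIC DIGIT THREE) and the variable names are chosen only for illustration:

```perl
use strict;
use warnings;
use 5.014;    # the /a modifier needs perl 5.14 or later

# U+0663 ARABIC-INDIC DIGIT THREE: a digit to Unicode, but not one of 0-9.
my $digit = "\x{0663}";

my $plain   = $digit =~ /\d/    ? 1 : 0;    # Unicode rules apply
my $ascii   = $digit =~ /\d/a   ? 1 : 0;    # /a restricts \d to 0-9
my $literal = $digit =~ /[0-9]/ ? 1 : 0;    # explicit class, works on any perl

print "\\d: $plain, \\d with /a: $ascii, [0-9]: $literal\n";
```

Only the plain \d accepts the non-ASCII digit; /a and the explicit [0-9] class both reject it.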
 
Ted Zlatanov
 
      06-06-2013
On Wed, 5 Jun 2013 20:22:05 +0100 Ben Morrow <(E-Mail Removed)> wrote:

BM> If you're happy relying on perl versions rather than the re module
BM> itself (probably pretty safe) you could use

BM> use if $] >= 5.014, re => "/a";

That's really useful, thanks.

Ted
 
Rainer Weikusat
 
      06-06-2013
Ben Morrow <(E-Mail Removed)> writes:
> Quoth Rainer Weikusat <(E-Mail Removed)>:


[code using $1]

>> The obvious other way to write this is
>>
>> ($oc) = $outcode =~ /^([A-Z]{1,2})\d/;

>
> I would consider this better style; there's no point mucking about with
> magic globals if you don't need to.


In this case, the point would be 'they are available and using them is
convenient'. E.g., assuming some kind of error handling was intended,
this could be written as

if ($outcode =~ /^([A-Z]{1,2})\d/) {
    # use the captured value here
} else {
    # complain
}

Of course,

if (($oc) = $outcode =~ /^([A-Z]{1,2})\d/) {
    # use the captured value here
} else {
    # complain
}

would work as well, but it is a more complicated expression and IMO
if-expressions with side effects should be avoided. Also, a variable
needs to be declared for everything which is supposed to be captured
via the return value, but the individual captured expressions might not
be useful on their own. Contrived example:

sub hton_postcode
{
    return $_[0] =~ /(...)\s+(...)/ && "$2 $1";
}

This is even 'safe' in the sense that it won't affect the $n of the
caller because the capture-variables are always automatically local to
the enclosing block.
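
The scoping claim can be checked with a short sketch. The outer match on "HP13" and the input "abc def" are made-up test values, not part of the original post:

```perl
use strict;
use warnings;

# The contrived helper from the post: swaps two three-character groups.
sub hton_postcode
{
    return $_[0] =~ /(...)\s+(...)/ && "$2 $1";
}

# A match in the caller sets the caller's $1 to "HP".
"HP13" =~ /^([A-Z]{1,2})[0-9]/ or die "outer match failed";

my $swapped = hton_postcode("abc def");

# The successful match inside the sub does not leak out: $1 is still "HP".
my $outer = $1;
print "swapped: $swapped, caller's \$1: $outer\n";
```

The sub performs its own successful match, yet the caller's $1 is unchanged once the sub returns.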
 
Tim McDaniel
 
      06-08-2013
In article <(E-Mail Removed)>,
Dave Stratford <(E-Mail Removed)> wrote:
>In article <(E-Mail Removed) >,
> Rainer Weikusat <(E-Mail Removed)> wrote:
>> Helmut Richter <(E-Mail Removed)> writes:
>> > On Tue, 4 Jun 2013, Dave Stratford wrote:
>> >
>> >> My requirements are to extract the initial letter(s) from the
>> >> outcode. So if a user entered HP13, I want to extract just the HP,
>> >> equally if they entered WC1A, I want just the WC.
>> >>
>> >> My current code, which works perfectly fine, looks like this:
>> >>
>> >> my $oc = substr($outcode,0,2);
>> >> my $ocr = substr($oc,1,1);
>> >>
>> >> $oc = substr($outcode,0,1) if ($ocr =~ /\d/);
>> >
>> > What about
>> >
>> > $outcode =~ /^([A-Z]+)\d/;
>> > $oc = $1;

>>

....
>> The obvious other way to write this is

>
>> ($oc) = $outcode =~ /^([A-Z]{1,2})\d/;

>

....
>I've gone with
>$oc =~ /^([A-Z]+)/a;


Did you typo "$oc = " as "$oc =~ "? Because the =~ form would change
the meaning from all the suggestions: it would pattern-match against
$oc and the matching part would go into $1, which has already been
advised against.

Also: I am not familiar with /a and friends, but I think it's useless
here, since the pattern uses only an explicit [A-Z] class. Looking at
the next RE, did you mean /i instead?

>sub valid_outcode
>{
> my $str = shift;
> my $regex =
>qr{^([BGLMNS][1-9][0-9]?|[A-PR-UWYZ][A-HK-Y][1-9]?[0-9]|([EW]C?|NW?|S[EW])[1-9][0-9A-HJKMNPR-Y])$}i;
>
> return ($str =~ $regex);
>}


That returns different things in list context and in scalar context:
in list context the caller gets the captured substrings rather than a
true/false value. I would code it as returning a boolean, which could
be written

return ($str =~ $regex ? 1 : 0);

(at a quick look, I can't figure out the precedence rules in perlop to
know whether the parens could be omitted) or, more succinctly, either of

return scalar ($str =~ $regex);

return !!($str =~ $regex);
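
A sketch of the difference, using a hypothetical two-capture pattern that is much simpler than the real outcode regex; the sub names match_raw and match_bool are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical two-capture pattern, far simpler than the real outcode regex.
my $regex = qr{^([A-Z]{1,2})([0-9])};

sub match_raw  { my $str = shift; return $str =~ $regex; }
sub match_bool { my $str = shift; return !!($str =~ $regex); }

my @raw       = match_raw("HP13");     # list context: the captures
my @bool      = match_bool("HP13");    # always exactly one value
my @raw_fail  = match_raw("1234");     # empty list on failure
my @bool_fail = match_bool("1234");    # one false value on failure

printf "raw: %d value(s), bool: %d value(s)\n", scalar @raw, scalar @bool;
printf "raw on failure: %d value(s), bool on failure: %d value(s)\n",
    scalar @raw_fail, scalar @bool_fail;
```

The raw version hands the caller the captures (or nothing at all), while the !! version always returns a single true or false value regardless of context.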


>I realise that I could probably have used $_ rather than "my $str =
>shift;", but I just prefer to do it this way, that way I remember
>what I'm dealing with, particularly on larger subs.


I like that too.

>> ($oc) = $outcode =~ /^([A-Z]{1,2})\d/;

>
>Actually Rainer, why did you do ($oc)? I understand that the () in
>this context returns a list?


More precisely, it causes the left-hand side to be a list context, so
the right-hand side is evaluated in a list context. Out of
"man perlop":

Binary "=~" binds a scalar expression to a pattern match.
... When used in scalar context, the return value generally
indicates the success of the operation. ...


m/PATTERN/msixpodualgc
/PATTERN/msixpodualgc
...
Matching in list context

If the "/g" option is not used, "m//" in list context returns a
list consisting of the subexpressions matched by the parentheses
in the pattern, i.e., ($1, $2, $3...). (Note that here $1
etc. are also set, and that this differs from Perl 4's behavior.)
When there are no parentheses in the pattern, the return value is
the list "(1)" for success. With or without parentheses, an empty
list is returned upon failure.

So
$oc = $outcode =~ /^([A-Z]{1,2})\d/;
would set $oc to merely true or false.
($oc) = $outcode =~ /^([A-Z]{1,2})\d/;
assigns the matched letters to it (or undef if it doesn't match).
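
Side by side, as a minimal sketch (the sample values "WC1A" and "1234" and the variable names $flag and $none are illustrative only):

```perl
use strict;
use warnings;

my $outcode = "WC1A";

my $flag   = $outcode =~ /^([A-Z]{1,2})\d/;    # scalar context: success flag
my ($oc)   = $outcode =~ /^([A-Z]{1,2})\d/;    # list context: first capture
my ($none) = "1234"   =~ /^([A-Z]{1,2})\d/;    # failed match: undef

print "flag: $flag, oc: $oc, none: ", defined $none ? $none : "(undef)", "\n";
```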

--
Tim McDaniel, (E-Mail Removed)

 
Tim McDaniel
 
      06-09-2013
In article <(E-Mail Removed)>,
Ben Morrow <(E-Mail Removed)> wrote:
>
>Quoth (E-Mail Removed):
>> return ($str =~ $regex ? 1 : 0);
>>
>> (at a quick look, I can't figure out the precedence rules in perlop
>> to know whether the parens could be omitted)

>
> ~% perl -MO=Deparse,-p -e'return $s =~ $r ? 1 : 0'
> (return (($str =~ /$rx/) ? 1 : 0));
> -e syntax OK
>
>So, no.


Um, so, yes, the parens can be omitted.

>'return' is a list operator


For future reference, where can I find that sort of thing documented?

>> or more succinctly
>>
>> return scalar ($str =~ $regex);

>
>That doesn't need the brackets either, for the same reason.


?: is not necessarily handled the same as scalar, so to a non-expert
like me it's not obviously the same situation. I think scalar is a
named unary operator, to go by the example in perlfunc scalar:

# perl -MO=Deparse,-p -e'@counts = ( scalar @a, scalar @b, scalar @c);'
(@counts = (scalar(@a), scalar(@b), scalar(@c)));
-e syntax OK

But perlop says that =~ binds more closely than named unary
operators, so it's all OK anyway:

$ perl -MO=Deparse,-p -e'return scalar $str =~ $regex'
return(scalar(($str =~ /$regex/)));
-e syntax OK

--
Tim McDaniel, (E-Mail Removed)
 
Tim McDaniel
 
      06-09-2013
In article <(E-Mail Removed)>,
Ben Morrow <(E-Mail Removed)> wrote:
>It's perhaps worth drawing a little more attention to this, since
>it's a special case in the Perl parser and not a normal use of
>brackets for precedence. Putting a single scalar term on the LHS of
>an = in a set of unnecessary brackets explicitly turns the assignment
>into a list assignment.
>
> $oc = # scalar assignment
> ($oc) = # list assignment


Took me years to learn that.

To amplify on that point:

For example, suppose you (not you == Ben, you == one) do a common
idiom for handling args:

my ($file, $pattern, $count) = @ARGV;

Suppose you need only one argument. If you do

my $base = @ARGV;

$base will get the number of arguments, not the first one.
You need

my ($base) = @ARGV;

with what you might think are redundant parens there, but (as Ben
mentioned) they're not, they're providing list context. It's a list
assignment and the left-hand side gets "my"ed, with the same effect as

my $base;
($base) = @ARGV;

As "man perlsub" puts it,

The "my" is simply a modifier on something you might assign to.
So when you do assign to variables in its argument list, "my"
doesn't change whether those variables are viewed as a scalar or
an array.
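
A runnable sketch of the same point. It uses a stand-in array @args instead of @ARGV so that it does not depend on the command line; the values and names are made up:

```perl
use strict;
use warnings;

# Stand-in for @ARGV so the sketch does not depend on the command line.
my @args = ("data.txt", "foo", 3);

my $count   = @args;                 # scalar assignment: number of elements
my ($first) = @args;                 # list assignment: first element
my ($file, $pattern, $n) = @args;    # the usual unpacking idiom

print "count: $count, first: $first\n";
```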

--
Tim McDaniel, (E-Mail Removed)
 
Tim McDaniel
 
      06-10-2013
In article <(E-Mail Removed)>,
Ben Morrow <(E-Mail Removed)> wrote:
>
>Quoth (E-Mail Removed):
>> In article <(E-Mail Removed)>,
>> Ben Morrow <(E-Mail Removed)> wrote:
>> >'return' is a list operator

>>
>> For future reference, where can I find that sort of thing
>> documented?


I meant for builtins.

>... builtins which take a LIST argument in perlfunc are list
>operators


return EXPR

return

Returns from a subroutine, eval, or do FILE with the value given
in EXPR. Evaluation of EXPR may be in list, scalar, or void
context, depending on how the return value will be used, ...

doesn't say LIST, but you say it's a list operator, and that makes
sense because it can return lists.

--
Tim McDaniel, (E-Mail Removed)
 