handle tab-delimited file

 
 
Ela
      03-15-2008
\t matches BOTH tab and space.

How can I split the following line into 2 words instead of 5?

1234\tI am a boy\n


 
Gunnar Hjalmarsson
      03-15-2008
Ela wrote:
> \t matches BOTH tab and space.


No, it doesn't.

> How can I split the following line into 2 words instead of 5?
>
> 1234\tI am a boy\n


split /\t/, "1234\tI am a boy\n"

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
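
A minimal sketch of the split suggested above, applied to the sample line from the question. The variable names ($line, $id, $text), the chomp, and the limit argument of 2 are illustrative additions, not part of the original reply:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample record from the question: one tab, then a phrase containing spaces.
my $line = "1234\tI am a boy\n";

# Remove the trailing newline so it does not end up in the last field.
chomp $line;

# Split on the tab character only. The limit of 2 is optional here; it
# caps the result at two fields even if the text part contained more tabs.
my ($id, $text) = split /\t/, $line, 2;

print "id:   $id\n";
print "text: $text\n";
```

Without the chomp, the second field would still end in "\n", which is usually not wanted when processing tab-delimited records.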
 
Tad J McClellan
      03-15-2008
Ela wrote:

> \t matches BOTH tab and space.



No it doesn't.

\s matches tab and space (and 3 other characters).

Is that what you meant?

(we wouldn't need to ask this if you had posted real Perl code.)


> How can I split the following line into 2 words instead of 5?
>
> 1234\tI am a boy\n



use PSI::ESP;

By splitting on \t rather than splitting on \s.


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
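
The difference between the two patterns can be checked directly. This is a small sketch using the sample line from the question; the variable names are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "1234\tI am a boy\n";

# \t matches only the tab, so the spaces inside the phrase are left alone.
my @by_tab = split /\t/, $line;

# \s matches any whitespace character, so the tab and every space act as
# separators. The trailing newline also matches, but split drops the
# empty field that it leaves at the end.
my @by_space = split /\s/, $line;

printf "split on \\t: %d fields\n", scalar @by_tab;
printf "split on \\s: %d fields\n", scalar @by_space;

# A lone space is matched by \s but not by \t.
my $tab_matches_space = (" " =~ /\t/) ? 1 : 0;
my $s_matches_space   = (" " =~ /\s/) ? 1 : 0;
print "space matches \\t: $tab_matches_space\n";
print "space matches \\s: $s_matches_space\n";
```

Splitting on \s is what produces the 5 words mentioned in the question; splitting on \t produces the 2 fields that were asked for.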
 
Ben Bullock
      03-16-2008
On Sat, 15 Mar 2008 14:10:12 +0000, Tad J McClellan wrote:

> \s matches tab and space (and 3 other characters).


Don't forget your Ogham space mark:

#!/usr/bin/perl
use warnings;
use strict;
use Unicode::UCD 'charinfo';

# Print every BMP character (surrogates and noncharacters skipped)
# that matches the given pattern, then the total.
sub count_match
{
    my ($re) = @_;
    my $c = 0;
    for my $n (0x00 .. 0xD7FF, 0xE000 .. 0xFDCF, 0xFDF0 .. 0xFFFD) {
        if (chr($n) =~ /$re/) {
            my $ci = charinfo($n);    # hash ref of Unicode properties
            printf "%02X which is %s matches\n", $n, $ci->{name};
            $c++;
        }
    }
    print "There are $c characters matching \"$re\".\n";
}
count_match('\s');

which gives:

09 which is <control> matches
0A which is <control> matches
0C which is <control> matches
0D which is <control> matches
20 which is SPACE matches
1680 which is OGHAM SPACE MARK matches
180E which is MONGOLIAN VOWEL SEPARATOR matches
2000 which is EN QUAD matches
2001 which is EM QUAD matches
2002 which is EN SPACE matches
2003 which is EM SPACE matches
2004 which is THREE-PER-EM SPACE matches
2005 which is FOUR-PER-EM SPACE matches
2006 which is SIX-PER-EM SPACE matches
2007 which is FIGURE SPACE matches
2008 which is PUNCTUATION SPACE matches
2009 which is THIN SPACE matches
200A which is HAIR SPACE matches
2028 which is LINE SEPARATOR matches
2029 which is PARAGRAPH SEPARATOR matches
202F which is NARROW NO-BREAK SPACE matches
205F which is MEDIUM MATHEMATICAL SPACE matches
3000 which is IDEOGRAPHIC SPACE matches
There are 23 characters matching "\s".
 