Discussion on some Code Issues

 
 
subhabangalore@gmail.com
Guest
Posts: n/a
 
      07-08-2012
On Sunday, July 8, 2012 2:21:14 AM UTC+5:30, Dennis Lee Bieber wrote:
> On Sat, 7 Jul 2012 12:54:16 -0700 (PDT),
> declaimed the following in gmane.comp.python.general:
>
> > But I am bit intrigued with another question,
> >
> > suppose I say:
> > file_open=open("/python32/doc1.txt","r")
> > file=a1.read().lower()
> > for line in file:
> >     line_word=line.split()
> >
> > This works fine. But if I print, it is all printed continuously.

>
> "This works fine" -- Really?
>
> 1) Why are you storing data files in the install directory of your
> Python interpreter?
>
> 2) "a1" is undefined -- you should get an exception on that line which
> makes the following irrelevant; replacing "a1" with "file_open" leads
> to...
>
> 3) "file" is a) the name of a built-in type in Python 2 (it was removed
> in Python 3), so the assignment shadows it there, and b) a poor name for
> a string containing the contents of a file
>
> 4) "for line in file", since "file" is a string, will iterate over EACH
> CHARACTER, meaning (since there is nothing to split) that "line_word" is
> also just a single character.
>
> for line in file.split("\n"):
>
> will split the STRING into logical lines (assuming a new-line character
> splits the lines) and permit the subsequent split to pull out wordS
> ("line_word" is misleading, as it will contain a LIST of words from the
> line).
>
> > I would like to store them in some variable, so that I may print line of my choice and manipulate them as I choose.
> > Is there any way to solve this problem?
> >
> >
> > Regards,
> > Subhabrata Banerjee

> --
> Wulfraed Dennis Lee Bieber AF6VN
> HTTP://wlfraed.home.netcom.com/
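
Point 4 in the quoted reply can be checked without any file at all. The following is a minimal sketch; the name text and the sample data are made up for illustration and stand in for what read().lower() would return. It uses splitlines() rather than the split("\n") shown in the quote, because splitlines() does not leave a trailing empty string when the text ends with a newline.

```python
# Hypothetical stand-in for the string returned by file_open.read().lower().
text = "the quick brown fox\njumps over the lazy dog\n"

# Iterating over a string yields one character at a time, not one line.
chars = [ch for ch in text][:3]

# splitlines() yields logical lines; split() then yields the words of each line.
words_per_line = [line.split() for line in text.splitlines()]

print(chars)
print(words_per_line)
```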


Thanks for pointing out the mistakes. Your points are right. So I am trying to revise it,

file_open=open("/python32/doc1.txt","r")
for line in file_open:
    line_word=line.split()
    print (line_word)

To store them, the best way is to start with an empty list and append to it. But is there any alternative
method? For huge data this gets difficult, as the list becomes huge. Is there some other way the results can be assigned to variables?

Regards,
Subhabrata Banerjee.


 
Chris Angelico
Guest
Posts: n/a
 
      07-08-2012
On Sun, Jul 8, 2012 at 3:42 PM, <> wrote:
> Thanks for pointing out the mistakes. Your points are right. So I am trying to revise it,
>
> file_open=open("/python32/doc1.txt","r")
> for line in file_open:
>     line_word=line.split()
>     print (line_word)


Yep. I'd be inclined to rename file_open to something that says what
the file _is_, and you may want to look into the 'with' statement to
guarantee timely closure of the file, but that's a way to do it.

Also, as has already been mentioned: keeping your data files in the
Python binaries directory isn't usually a good idea. More common to
keep them in the same directory as your script, which would mean that
you don't need a path on it at all.
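
A minimal sketch of both suggestions, assuming the doc1.txt from the original post sits in the directory the script is run from. The function name read_word_lists and the variable name doc are made up for illustration.

```python
def read_word_lists(filename):
    """Return a list holding one list of words for each line of the file."""
    word_lists = []
    # The with statement closes the file when the block ends,
    # even if an exception is raised inside it.
    with open(filename, "r") as doc:
        for line in doc:
            word_lists.append(line.split())
    return word_lists
```

Calling read_word_lists("doc1.txt") with no directory part would then return all the lines at once, so that read_word_lists("doc1.txt")[0] is the list of words on the first line.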

ChrisA
 
subhabangalore@gmail.com
Guest
Posts: n/a
 
      07-08-2012
On Sunday, July 8, 2012 1:33:25 PM UTC+5:30, Chris Angelico wrote:
> On Sun, Jul 8, 2012 at 3:42 PM, <> wrote:
> > Thanks for pointing out the mistakes. Your points are right. So I am trying to revise it,
> >
> > file_open=open("/python32/doc1.txt","r")
> > for line in file_open:
> >     line_word=line.split()
> >     print (line_word)

>
> Yep. I'd be inclined to rename file_open to something that says what
> the file _is_, and you may want to look into the 'with' statement to
> guarantee timely closure of the file, but that's a way to do it.
>
> Also, as has already been mentioned: keeping your data files in the
> Python binaries directory isn't usually a good idea. More common to
> keep them in the same directory as your script, which would mean that
> you don't need a path on it at all.
>
> ChrisA


Dear Chris,
No file path! Amazing. I did not know that; I would like to see one small example, please.
Btw, some earlier post said that line.split(), to convert a line into a bag of words, can be done with power(), but I could not find it; any help is welcome. I do close my files, do not worry. I will try the new style.

Regards,
Subha
 
Chris Angelico
Guest
Posts: n/a
 
      07-08-2012
On Mon, Jul 9, 2012 at 3:05 AM, <> wrote:
> On Sunday, July 8, 2012 1:33:25 PM UTC+5:30, Chris Angelico wrote:
>> On Sun, Jul 8, 2012 at 3:42 PM, <> wrote:
>> > file_open=open("/python32/doc1.txt","r")

>> Also, as has already been mentioned: keeping your data files in the
>> Python binaries directory isn't usually a good idea. More common to
>> keep them in the same directory as your script, which would mean that
>> you don't need a path on it at all.

> No file path! Amazing. I did not know that; I would like to see one small example, please.


open("doc1.txt","r")

Python will look for a file called doc1.txt in the directory you run
the script from (which is often going to be the same directory as your
.py program).

> Btw, some earlier post said that line.split(), to convert a line into a bag of words, can be done with power(), but I could not find it; any help is welcome. I do close my files, do not worry. I will try the new style.


I don't know what power() function you're talking about, and can't
find it in the previous posts; the nearest I can find is a post from
Ranting Rick which says a lot of guff that you can ignore. (Rick is a
professional troll. Occasionally he says something useful and
courteous; more often it's one or the other, or neither.)

As to the closing of files: There are a few narrow issues that make it
worth using the 'with' statement, such as exceptions; mostly, it's
just a good habit to get into. If you ignore it, your file will
*usually* be closed fairly soon after you stop referencing it, but
there's no guarantee. (Someone else will doubtless correct me if I'm
wrong, but I'm pretty sure Python guarantees to properly flush and
close on exit, but not necessarily before.)
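
The exception case can be sketched like this. The function name closed_after_error and the throwaway temporary file are assumptions made only so that the example runs anywhere; the point is that the file is already closed once the with block has been left, even though the block was left through an exception.

```python
import os
import tempfile


def closed_after_error(path):
    """Raise inside a with block and report whether the file was closed."""
    handle = None
    try:
        with open(path, "r") as doc:
            handle = doc  # keep a reference so it can be inspected afterwards
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass
    return handle.closed


# Create a throwaway file so the example does not depend on doc1.txt.
fd, tmp_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
print(closed_after_error(tmp_path))  # True
os.remove(tmp_path)
```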

ChrisA
 
Roy Smith
Guest
Posts: n/a
 
      07-08-2012
In article <mailman.1922.1341767824.4697.python->,
Chris Angelico <> wrote:

> open("doc1.txt","r")
>
> Python will look for a file called doc1.txt in the directory you run
> the script from (which is often going to be the same directory as your
> .py program).


Well, to pick a nit, the file will be looked for in the current working
directory. This may or may not be the directory you ran your script
from. Your script could have executed chdir() between the time you
started it and you tried to open the file.

To pick another nit, it's misleading to say, "Python will look for...".
This implies that Python somehow gets involved in pathname resolution,
when it doesn't. Python just passes paths to the operating system as
opaque strings, and the OS does all the magic of figuring out what that
string means.
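
A small sketch of the difference, with made-up function names. The first function only mimics, for a plain relative name, what the operating system ends up doing with the current working directory; the second builds a path next to a given script file, so it does not depend on the working directory at all.

```python
import os
from pathlib import Path


def path_open_would_use(name):
    """Absolute path a relative name refers to, given the current working directory."""
    return os.path.join(os.getcwd(), name)


def path_next_to(script_file, name):
    """Path of name in the same directory as script_file, whatever the cwd is."""
    return Path(script_file).resolve().parent / name
```

Inside a script, path_next_to(__file__, "doc1.txt") would give a path that stays valid even after an os.chdir() call.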
 
MRAB
Guest
Posts: n/a
 
      07-08-2012
On 08/07/2012 18:17, Chris Angelico wrote:
> On Mon, Jul 9, 2012 at 3:05 AM, <> wrote:
>> On Sunday, July 8, 2012 1:33:25 PM UTC+5:30, Chris Angelico wrote:
>>> On Sun, Jul 8, 2012 at 3:42 PM, <> wrote:
>>> > file_open=open("/python32/doc1.txt","r")
>>> Also, as has already been mentioned: keeping your data files in the
>>> Python binaries directory isn't usually a good idea. More common to
>>> keep them in the same directory as your script, which would mean that
>>> you don't need a path on it at all.

>> No file path! Amazing. I did not know that; I would like to see one small example, please.

>
> open("doc1.txt","r")
>
> Python will look for a file called doc1.txt in the directory you run
> the script from (which is often going to be the same directory as your
> .py program).
>
>> Btw, some earlier post said that line.split(), to convert a line into a bag
>> of words, can be done with power(), but I could not find it; any help is
>> welcome. I do close my files, do not worry. I will try the new style.

>
> I don't know what power() function you're talking about, and can't
> find it in the previous posts; the nearest I can find is a post from
> Ranting Rick which says a lot of guff that you can ignore. (Rick is a
> professional troll. Occasionally he says something useful and
> courteous; more often it's one or the other, or neither.)
>

I believe the relevant quote is """especially the Python gods have
given you *power* over string objects""". If that's the case, he's not
referring to a method or a function called "power".

He did give the good warning about the problem there could be if the
original string contains "$", the character being used as the separator.

> As to the closing of files: There are a few narrow issues that make it
> worth using the 'with' statement, such as exceptions; mostly, it's
> just a good habit to get into. If you ignore it, your file will
> *usually* be closed fairly soon after you stop referencing it, but
> there's no guarantee. (Someone else will doubtless correct me if I'm
> wrong, but I'm pretty sure Python guarantees to properly flush and
> close on exit, but not necessarily before.)
>

 
Dennis Lee Bieber
Guest
Posts: n/a
 
      07-08-2012
On Sat, 7 Jul 2012 22:42:13 -0700 (PDT),
declaimed the following in gmane.comp.python.general:

>
> Thanks for pointing out the mistakes. Your points are right. So I am trying to revise it,
>
> file_open=open("/python32/doc1.txt","r")
> for line in file_open:
>     line_word=line.split()
>     print (line_word)
>
> To store them, the best way is to start with an empty list and append to it. But is there any alternative
> method? For huge data this gets difficult, as the list becomes huge. Is there some other way the results can be assigned to variables?
>

Well, first to copy from an earlier post (just so I can trim the
unneeded)...

> > > I would like to store them in some variable, so that I may print line of my choice and manipulate them as I choose.
> > > Is there any way to solve this problem?


It is still not clear exactly what the task itself is supposed to
be.

After all, you are splitting the line into a LIST of words, and then
here state the goal is to "print line of" choice... The line and not the
list? There is no hint of what "manipulate them" involves.

If the files are of any size, I would not even attempt to store them
internally... I'd be more likely to run a preprocess phase which opens
the file in binary mode, (maybe reads it in chunks), and builds a list
of /offsets/ to the start of each line. To process any specific line
later would use seek() operations to the start of the line, followed by
a read operation of just the length to the next line.
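
As a rough sketch of that preprocess phase (the function names and the chunk size are my own choices for illustration, not anything standard): the file is read in binary chunks, the byte offset of each line start is recorded, and the file size is appended as a final sentinel so that every line has an end offset.

```python
def build_line_offsets(path, chunk_size=64 * 1024):
    """Return the byte offset of each line start, plus the file size at the end."""
    offsets = [0]
    position = 0  # number of bytes consumed before the current chunk
    with open(path, "rb") as raw:
        while True:
            chunk = raw.read(chunk_size)
            if not chunk:
                break
            start = 0
            while True:
                newline = chunk.find(b"\n", start)
                if newline == -1:
                    break
                offsets.append(position + newline + 1)
                start = newline + 1
            position += len(chunk)
    # If the file ends with a newline, the last recorded start is not a real
    # line; drop it, then append the file size as the sentinel.
    if offsets[-1] == position and len(offsets) > 1:
        offsets.pop()
    offsets.append(position)
    return offsets


def read_line(path, offsets, line_number):
    """Read one line (0-based) with seek() and a read() of exactly its length."""
    start, end = offsets[line_number], offsets[line_number + 1]
    with open(path, "rb") as raw:
        raw.seek(start)
        return raw.read(end - start)
```

The result of read_line() is a bytes object that still includes the line ending; it can be decoded and split into words as needed, and only the list of offsets has to stay in memory.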

Doing an mmap() of the file may even speed up the later processing,
as you wouldn't be using I/O seeks, but just asking for slices from the
mmap'd file. The OS would be responsible for making sure the file
contents were in memory.
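
The mmap variant might look something like the sketch below. The function names are hypothetical, UTF-8 text is an assumption, and for brevity the line starts are recomputed on every call; real code would build that index once and keep the mapping open.

```python
import mmap


def line_starts(buffer):
    """Byte offsets where each line begins, plus the total length as a sentinel."""
    starts = [0]
    position = buffer.find(b"\n")
    while position != -1:
        starts.append(position + 1)
        position = buffer.find(b"\n", position + 1)
    if starts[-1] == len(buffer) and len(starts) > 1:
        starts.pop()  # the data ended with a newline; no extra line follows
    starts.append(len(buffer))
    return starts


def words_of_line(path, line_number):
    """Return the words of one line (0-based) by slicing a memory-mapped file."""
    with open(path, "rb") as raw:
        # A length of 0 maps the whole file; an empty file cannot be mapped
        # this way and typically raises ValueError.
        with mmap.mmap(raw.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            starts = line_starts(mapped)
            chunk = mapped[starts[line_number]:starts[line_number + 1]]
    return chunk.decode("utf-8").split()
```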

This won't work if the manipulation requires making a line longer or
shorter. In that case, preprocessing would be writing the lines to a
simple BSD-DB style "database", in which the "line number" is the key;
any manipulation would work on records fetched by line number, and
written back.

If you also store a "process date" in the BSD-DB database, you could
match it to the last modified time of the source file and skip
reprocessing if the source has not changed.
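
A minimal sketch of that idea follows. The bsddb module is no longer part of the Python 3 standard library, so this uses the generic dbm module as a stand-in; the function names, the "__mtime__" key, and the UTF-8 encoding are all assumptions made for illustration.

```python
import dbm
import os

MTIME_KEY = b"__mtime__"  # hypothetical key holding the "process date"


def _key(line_number):
    return str(line_number).encode("ascii")


def load_lines(source_path, db_path):
    """Copy the lines of source_path into a dbm file keyed by line number.

    Returns True if the database was (re)built, False if the stored
    modification time shows that the source has not changed.
    """
    mtime = repr(os.path.getmtime(source_path)).encode("ascii")
    with dbm.open(db_path, "c") as db:
        if db.get(MTIME_KEY) == mtime:
            return False
        for key in db.keys():  # discard records from an older version
            del db[key]
        with open(source_path, "r", encoding="utf-8") as source:
            for number, line in enumerate(source):
                db[_key(number)] = line.rstrip("\n").encode("utf-8")
        db[MTIME_KEY] = mtime
    return True


def get_line(db_path, line_number):
    """Fetch one line (0-based) by its line number."""
    with dbm.open(db_path, "r") as db:
        return db[_key(line_number)].decode("utf-8")


def set_line(db_path, line_number, text):
    """Write a line back; it may be longer or shorter than before."""
    with dbm.open(db_path, "w") as db:
        db[_key(line_number)] = text.encode("utf-8")
```

Which on-disk format dbm picks depends on what is installed, so the files it creates next to db_path may carry different extensions from one system to another.
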
--
Wulfraed Dennis Lee Bieber AF6VN
HTTP://wlfraed.home.netcom.com/

 
Chris Angelico
Guest
Posts: n/a
 
      07-08-2012
On Mon, Jul 9, 2012 at 4:17 AM, Roy Smith <> wrote:
> In article <mailman.1922.1341767824.4697.python->,
> Chris Angelico <> wrote:
>
>> open("doc1.txt","r")
>>
>> Python will look for a file called doc1.txt in the directory you run
>> the script from (which is often going to be the same directory as your
>> .py program).

>
> Well, to pick a nit, the file will be looked for in the current working
> directory. This may or may not be the directory you ran your script
> from. Your script could have executed chdir() between the time you
> started it and you tried to open the file.
>
> To pick another nit, it's misleading to say, "Python will look for...".
> This implies that Python somehow gets involved in pathname resolution,
> when it doesn't. Python just passes paths to the operating system as
> opaque strings, and the OS does all the magic of figuring out what that
> string means.


Two perfectly accurate nitpicks. And of course, there's a million and
one other things that could happen in between, too, including
possibilities of the current directory not even existing and so on. I
merely oversimplified in the hopes of giving a one-paragraph
explanation of what it means to not put a path name in your open()
call. It's like the difference between reminder text on a Magic: The
Gathering card and the actual entries in the Comprehensive Rules.
Perfect example is the "Madness" ability - the reminder text explains
the ability, but uses language that actually is quite incorrect. It's
a better explanation, though.

Am I overanalyzing this? Yeah, probably...

ChrisA
 