Velocity Reviews > Newsgroups > Programming > Python

Project organization and import

 
 
Martin Unsal, 03-05-2007
I'm using Python for what is becoming a sizeable project and I'm
already running into problems organizing code and importing packages.
I feel like the Python package system, in particular the isomorphism
between filesystem and namespace, doesn't seem very well suited for
big projects. However, I might not really understand the Pythonic way.
I'm not sure if I have a specific question here, just a general plea
for advice.

1) Namespace. Python wants my namespace hierarchy to match my
filesystem hierarchy. I find that a well-organized filesystem
hierarchy for a nontrivial project will be totally unwieldy as a
namespace. I'm either forced to use long namespace prefixes, or I'm
forced to use "from foo import *" and __all__, which has its own set
of problems.

1a) Module/class collision. I like to use the primary class in a file
as the name of the file. However this can lead to namespace collisions
between the module name and the class name. Also it means that I'm
going to be stuck with the odious and wasteful syntax foo.foo
everywhere, or forced to use "from foo import *".

1b) The Pythonic way seems to be to put more stuff in one file, but I
believe this is categorically the wrong thing to do in large projects.
The moment you have more than one developer along with a revision
control system, you're going to want files to contain the smallest
practical functional blocks. I feel pretty confident saying that "put
more stuff in one file" is the wrong answer, even if it is the
Pythonic answer.

2) Importing and reloading. I want to be able to reload changes
without exiting the interpreter. This pretty much excludes "from foo
import *", unless you resort to this sort of hack:

http://www.python.org/search/hyperma...1993/0448.html

Has anyone found a systematic way to solve the problem of reloading in
an interactive interpreter when using "from foo import *"?


I appreciate any advice I can get from the community.

Martin

 
Jorge Godoy, 03-05-2007
"Martin Unsal" <(E-Mail Removed)> writes:

> 1) Namespace. Python wants my namespace hierarchy to match my filesystem
> hierarchy. I find that a well-organized filesystem hierarchy for a
> nontrivial project will be totally unwieldy as a namespace. I'm either
> forced to use long namespace prefixes, or I'm forced to use "from foo import
> *" and __all__, which has its own set of problems.


I find it nice. You get an idea of where something is just from the import,
and you don't have to search for it everywhere. Isn't, e.g., Java like that?
(It's been so long since I last worried about Java that I don't remember if
this is mandatory or just a convention...)

You might get bitten by that when moving files from one OS to another,
especially if one of them ignores case and the other is strict about it.

> 1a) Module/class collision. I like to use the primary class in a file as the
> name of the file. However this can lead to namespace collisions between the
> module name and the class name. Also it means that I'm going to be stuck
> with the odious and wasteful syntax foo.foo everywhere, or forced to use
> "from foo import *".


Your classes should be CamelCased and start with an uppercase letter. So
you'd have foo.Foo, where "foo" is the module and "Foo" the class inside it.

> 1b) The Pythonic way seems to be to put more stuff in one file, but I
> believe this is categorically the wrong thing to do in large projects. The
> moment you have more than one developer along with a revision control
> system, you're going to want files to contain the smallest practical
> functional blocks. I feel pretty confident saying that "put more stuff in
> one file" is the wrong answer, even if it is the Pythonic answer.


Why? RCS systems can merge changes. An RCS system is not a substitute for
design or programmer communication. You'll only have a problem if two people
change the same line of code, and if they are doing that (and worse: doing it
often) then you have a bigger problem than just the contents of the file.

Unit tests help ensure that one change doesn't break the project as a
whole, and for a big project you're surely going to have a lot of those tests.

If one change breaks another, then there is a disagreement on the application
design and more communication is needed between developers or a better
documentation of the API they're implementing / using.

> 2) Importing and reloading. I want to be able to reload changes without
> exiting the interpreter. This pretty much excludes "from foo import *",
> unless you resort to this sort of hack:
>
> http://www.python.org/search/hyperma...1993/0448.html
>
> Has anyone found a systematic way to solve the problem of reloading in an
> interactive interpreter when using "from foo import *"?


I don't reload... When my investigative tests get bigger I write a script
and run it with the interpreter. It is easy since my text editor can call
Python on a buffer (I use Emacs).

> I appreciate any advice I can get from the community.


This is just how I deal with it... My bigger "project" has several modules
now, each with its own namespace and package. The API is thoroughly
documented and took the most work to get done.

Using setuptools, entrypoints, etc. helps a lot as well.


The thing is that for big projects your design is the most important part.
Get it right and you won't have problems with namespaces and filenames. If
you don't dedicate enough time on this task you'll find yourself in trouble
really soon.

--
Jorge Godoy <(E-Mail Removed)>
 
bruno.desthuilliers@gmail.com, 03-05-2007
On 5 mar, 01:21, "Martin Unsal" <(E-Mail Removed)> wrote:
> I'm using Python for what is becoming a sizeable project and I'm
> already running into problems organizing code and importing packages.
> I feel like the Python package system, in particular the isomorphism
> between filesystem and namespace,


It's not necessarily a 1:1 mapping. Remember that you can put code in
the __init__.py of a package, and that this code can import sub-package
and module namespaces, making the package's internal organisation
transparent to user code (I've quite often started with a simple
module, later turning it into a package as the source code was
growing too big).
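A minimal sketch of that idea; the "widgets"/"ScrollBar" names are borrowed from later in this thread, and the package is written to a temp directory only so the example is self-contained:

```python
# Illustrative sketch: __init__.py re-exports names so client code
# never sees the file layout. The package is written to a temp dir
# only to keep the example runnable.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "widgets")
os.makedirs(pkg)

# widgets/scrollbar.py holds the class...
with open(os.path.join(pkg, "scrollbar.py"), "w") as f:
    f.write("class ScrollBar:\n    pass\n")

# ...and widgets/__init__.py lifts it to the package top level.
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from widgets.scrollbar import ScrollBar\n")

sys.path.insert(0, root)
import widgets

sb = widgets.ScrollBar()  # clients never type widgets.scrollbar.ScrollBar
```

Later, splitting scrollbar.py into a subpackage only means updating the re-export in __init__.py; client code is untouched.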

> doesn't seem very well suited for
> big projects. However, I might not really understand the Pythonic way.


cf above.

> I'm not sure if I have a specific question here, just a general plea
> for advice.
>
> 1) Namespace. Python wants my namespace heirarchy to match my
> filesystem heirarchy. I find that a well organized filesystem
> heirarchy for a nontrivial project will be totally unwieldy as a
> namespace. I'm either forced to use long namespace prefixes, or I'm
> forced to use "from foo import *" and __all__, which has its own set
> of problems.


cf above. Also remember that you can "import as", ie:

import some_package.some_subpackage.some_module as some_module

> 1a) Module/class collision. I like to use the primary class in a file
> as the name of the file.


Bad form IMHO. Packages and module names should be all_lower,
classnames CamelCased.

>
> 1b) The Pythonic way seems to be to put more stuff in one file,


Pythonic way is to group together highly related stuff. Not to "put
more stuff".

> but I
> believe this is categorically the wrong thing to do in large projects.


Oh yes ? Why ?

> The moment you have more than one developer along with a revision
> control system,


You *always* have a revision system, don't you ? And having more than
one developer on a project - be it big or small - is quite common.

> you're going to want files to contain the smallest
> practical functional blocks. I feel pretty confident saying that "put
> more stuff in one file" is the wrong answer, even if it is the
> Pythonic answer.


Is this actually based on working experience ? It seems that there are
enough non-trivial Python projects around to prove that it works just
fine.


 
Martin Unsal, 03-05-2007
On Mar 5, 12:45 am, "(E-Mail Removed)"
<(E-Mail Removed)> wrote:
> Remember that you can put code in
> the __init__.py of a package, and that this code can import sub-
> packages/modules namespaces, making the package internal organisation
> transparent to user code


Sure, but that doesn't solve the problem.

Say you have a package "widgets" with classes ScrollBar, Form, etc.
You want the end user to "import widgets" and then invoke
"widgets.ScrollBar()". As far as I know there are only two ways to do
this, both seriously flawed: 1) Put all your code in one module
widgets.py, 2) use "from scrollbar import *" in widgets/__init__.py,
which is semi-deprecated and breaks reload().

> Also remember that you can "import as", ie:
>
> import some_package.some_subpackage.some_module as some_module


Sure, but that doesn't eliminate the unfortunate interaction between
Python class organization and filesystem hierarchy. For example, say
you want to organize the widgets package as follows:

widgets/scrollbar/*.py
widgets/form/*.py
widgets/common/util.py

Other than messing around with PYTHONPATH, which is horrible, I don't
see how to import util.py from the widget code.
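For what it's worth, absolute imports inside the package make this work without touching PYTHONPATH, as long as the directory *containing* widgets/ is on sys.path. A runnable sketch of Martin's layout (the file names and `pad`/`bar_width` helpers are hypothetical, and the temp-dir juggling exists only to make it self-contained):

```python
# widgets/scrollbar/core.py reaches widgets/common/util.py with a
# plain absolute import; only the project root goes on sys.path.
import os
import sys
import tempfile

root = tempfile.mkdtemp()

def write(relpath, text):
    path = os.path.join(root, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)

write("widgets/__init__.py", "")
write("widgets/common/__init__.py", "")
write("widgets/common/util.py",
      "def pad(n):\n"
      "    return n + 2\n")
write("widgets/scrollbar/__init__.py", "")
write("widgets/scrollbar/core.py",
      "from widgets.common import util\n\n"
      "def bar_width(n):\n"
      "    return util.pad(n)\n")

sys.path.insert(0, root)  # one entry for the whole tree
from widgets.scrollbar import core
width = core.bar_width(10)  # -> 12
```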

> Bad form IMHO. Packages and module names should be all_lower,
> classnames CamelCased.


You're still stuck doing foo.Foo() everywhere in your client code,
which is ugly and wastes space, or using "from foo import *" which is
broken.

> > but I
> > believe this is categorically the wrong thing to do in large projects.

>
> Oh yes ? Why ?


For myriad reasons, just one of them being the one I stated -- smaller
files with one functional unit each are more amenable to source code
management with multiple developers.

We could discuss this till we're blue in the face but it's beside the
point. For any given project, architecture, and workflow, the
developers are going to have a preference for how to organize the code
structurally into files, directories, packages, etc. The language
itself should not place constraints on them. The mere fact that it is
supposedly "Pythonic" to put more functionality in one file indicates
to me that the Python package system is obstructing some of its users
who have perfectly good reasons to organize their code differently.

> > you're going to want files to contain the smallest
> > practical functional blocks. I feel pretty confident saying that "put
> > more stuff in one file" is the wrong answer, even if it is the
> > Pythonic answer.

>
> Is this actually based on working experience ? It seems that there are
> enough not-trivial Python projects around to prove that it works just
> fine.


Yes. I've worked extensively on several projects in several languages
with millions of lines of code, and they invariably have coding
styles that recommend one functional unit (such as a class), or at
most a few closely related functional units per file.

In Python, most of the large projects I've looked at use "from foo
import *" liberally.

I guess my question boils down to this. Is "from foo import *" really
deprecated or not? If everyone has to use "from foo import *" despite
the problems it causes, how do they work around those problems (such
as reloading)?

Martin

 
Martin Unsal, 03-05-2007
Jorge, thanks for your response. I replied earlier but I think my
response got lost. I'm trying again.

On Mar 4, 5:20 pm, Jorge Godoy <(E-Mail Removed)> wrote:
> Why? RCS systems can merge changes. A RCS system is not a substitute for
> design or programmers communication.


Text merges are an error-prone process. They can't be eliminated but
they are best avoided when possible.

When refactoring, it's much better to move small files around than to
move chunks of code between large files. In the former case your SCM
system can track integration history, which is a big win.

> Unit tests help being sure that one change doesn't break the project as a
> whole and for a big project you're surely going to have a lot of those tests.


But unit tests are never an excuse for error prone workflow. "Oh,
don't worry, we'll catch that with unit tests" is never something you
want to say or hear.

> I don't reload... When my investigative tests gets bigger I write a script
> and run it with the interpreter. It is easy since my text editor can call
> Python on a buffer (I use Emacs).


That's interesting, is this workflow pretty universal in the Python
world?

I guess that seems unfortunate to me, one of the big wins for
interpreted languages is to make the development cycle as short and
interactive as possible. As I see it, the Python way should be to
reload a file and reinvoke the class directly, not to restart the
interpreter, load an entire package and then run a test script to set
up your test conditions again.

Martin

 
Chris Mellon, 03-05-2007
On 5 Mar 2007 08:32:34 -0800, Martin Unsal <(E-Mail Removed)> wrote:
> Jorge, thanks for your response. I replied earlier but I think my
> response got lost. I'm trying again.
>
> On Mar 4, 5:20 pm, Jorge Godoy <(E-Mail Removed)> wrote:
> > Why? RCS systems can merge changes. A RCS system is not a substitute for
> > design or programmers communication.

>
> Text merges are an error-prone process. They can't be eliminated but
> they are best avoided when possible.
>
> When refactoring, it's much better to move small files around than to
> move chunks of code between large files. In the former case your SCM
> system can track integration history, which is a big win.
>
> > Unit tests help being sure that one change doesn't break the project as a
> > whole and for a big project you're surely going to have a lot of those tests.

>
> But unit tests are never an excuse for error prone workflow. "Oh,
> don't worry, we'll catch that with unit tests" is never something you
> want to say or hear.
>


That's actually the exact benefit of unit testing, but I don't feel
that you've actually made a case that this workflow is error prone.
You often have multiple developers working on the same parts of the
same module?

> > I don't reload... When my investigative tests gets bigger I write a script
> > and run it with the interpreter. It is easy since my text editor can call
> > Python on a buffer (I use Emacs).

>
> That's interesting, is this workflow pretty universal in the Python
> world?
>
> I guess that seems unfortunate to me, one of the big wins for
> interpreted languages is to make the development cycle as short and
> interactive as possible. As I see it, the Python way should be to
> reload a file and reinvoke the class directly, not to restart the
> interpreter, load an entire package and then run a test script to set
> up your test conditions again.


If you don't do this, you aren't really testing your changes, you're
testing your reload() machinery. You seem to have a lot of views about
what the "Python way" should be and those are at odds with the actual
way people work with Python. I'm not (necessarily) saying you're
wrong, but you seem to be coming at this from a confrontational
standpoint.

Your claim, for example, that the language shouldn't place constraints
on how you manage your modules is questionable. I think it's more
likely that you've developed a workflow based around the constraints
(and abilities) of other languages and you're now expecting Python to
conform to that instead of its own.

I've copied some of your responses from your earlier post below:

>Yes. I've worked extensively on several projects in several languages
>with millions of lines of code, and they invariably have coding
>styles that recommend one functional unit (such as a class), or at
>most a few closely related functional units per file.


I wonder if you've ever asked yourself why this is the case. I know
from my own experience why it's done in traditional C++/C environments
- it's because compiling is slow and breaking things into as many
files (with as few interdependencies) as possible speeds up the
compilation process. Absent this need (which doesn't exist in Python),
what benefit is there to separating out related functionality into
multiple files? Don't split them up just because you've done so in the
past - know why you did it in the past and if those conditions still
apply. Don't split them up until it makes sense for *this* project,
not the one you did last year or 10 years ago.

>I guess my question boils down to this. Is "from foo import *" really
>deprecated or not? If everyone has to use "from foo import *" despite
>the problems it causes, how do they work around those problems (such
>as reloading)?


from foo import * is a bad idea at a top level because it pollutes
your local namespace. In a package __init__, which exists expressly
for the purpose of exposing its interior namespaces as a single flat
one, it makes perfect sense. In some cases you don't want to export
everything, which is when __all__ starts to make sense. Clients of a
package (or a module) shouldn't use from foo import * without a good
reason. Nobody I know uses reload() for anything more than trivial "as
you work" testing in the interpreter. It's not reliable or recommended
for anything other than that. It's not hard to restart a shell,
especially if you use ipython (which can save and re-create a session)
or a script that's set up to create your testing environment. This is
still a much faster way than compiling any but the most trivial of
C/C++ modules. In fact, on my system startup time for the interpreter
is roughly the same as the "startup time" of my compiler (that is to
say, the amount of time it takes deciding what it's going to compile,
without actually compiling anything).
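A small sketch of the __all__ point, with a throwaway module (the "shapes"/"Circle" names are made up) written to a temp dir so it runs as-is:

```python
# __all__ controls what "from shapes import *" exports; names left
# out of it stay hidden from star imports even if they are public.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
with open(os.path.join(root, "shapes.py"), "w") as f:
    f.write(
        "__all__ = ['Circle']\n"
        "class Circle:\n    pass\n"
        "class Square:\n    pass\n"  # public, but not exported by *
        "_cache = {}\n"
    )
sys.path.insert(0, root)

ns = {}
exec("from shapes import *", ns)  # star import honors __all__
exported = sorted(k for k in ns if not k.startswith("__"))  # ['Circle']
```

`Square` is still reachable as `shapes.Square`; __all__ only shapes the star-import surface.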

>You're still stuck doing foo.Foo() everywhere in your client code,
>which is ugly and wastes space, or using "from foo import *" which is
>broken.


If you don't like working with explicit namespaces, you've probably
chosen the wrong language. If you have a specific name (or a few
names) which you use all the time from a module, then you can import
just those names into your local namespace to save on typing. You can
also alias deeply nested names to something more shallow.

>For myriad reasons, just one of them being the one I stated -- smaller
>files with one functional unit each are more amenable to source code
>management with multiple developers.


I propose that the technique most amenable to source code management
is for a single file (or RCS level module, if you have a locking RCS)
to have everything that it makes sense to edit or change for a
specific feature. This is an impossible goal in practice (because you
will inevitably and necessarily have intermodule dependencies) but
your developers don't write code based around individual files. They
base it around the systems and the interfaces that compose your
project. It makes no more sense to arbitrarily break them into
multiple files than it does to arbitrarily leave them all in a single
file.

In summary: I think you've bound yourself to a style of source
management that made sense in the past without reanalyzing it to see
if it makes sense now. Trust your judgment and that of your developers
when it comes to modularization. When they end up needing to merge all
the time because they're conflicting with someone else's work, they'll
break things up into modules.

You're also placing far too much emphasis on reload. Focus yourself on
unit tests and environment scripts instead. These are more reliable
and easier to validate than reload() in a shell.
 
Martin Unsal, 03-05-2007
On Mar 5, 9:15 am, "Chris Mellon" <(E-Mail Removed)> wrote:
> That's actually the exact benefit of unit testing, but I don't feel
> that you've actually made a case that this workflow is error prone.
> You often have multiple developers working on the same parts of the
> same module?


Protecting your head is the exact benefit of bike helmets; that
doesn't mean you should bike more recklessly just because you're
wearing a helmet.

Doing text merges is more error prone than not doing them.

There are myriad other benefits of breaking up large files into
functional units. Integration history, refactoring, reuse, as I
mentioned. Better clarity of design. Easier communication and
coordination within a team. What's the down side? What's the advantage
of big files with many functional units?

> If you don't do this, you aren't really testing your changes, you're
> testing your reload() machinery.


Only because reload() is hard in Python!

> You seem to have a lot of views about
> what the "Python way" should be and those are at odds with the actual
> way people work with Python. I'm not (necessarily) saying you're
> wrong, but you seem to be coming at this from a confrontational
> standpoint.


When I refer to "Pythonic" all I'm talking about is what I've read
here and observed in other people's code. I'm here looking for more
information about how other people work, to see if there are good
solutions to the problems I see.

However when I talk about what I think is "wrong" with the Pythonic
way, obviously that's just my opinion formed by my own experience.

> Your claim, for example, that the language shouldn't place constraints
> on how you manage your modules is questionable. I think it's more
> likely that you've developed a workflow based around the constraints
> (and abilities) of other languages and you're now expecting Python to
> conform to that instead of its own.


I don't think so; I'm observing things that are common to several
projects in several languages.

> I wonder if you've ever asked yourself why this is the case. I know
> from my own experience why it's done in traditional C++/C environments
> - it's because compiling is slow and breaking things into as many
> files (with as few interdependencies) as possible speeds up the
> compilation process.


I don't think that's actually true. Fewer, bigger compilation units
actually compile faster in C, at least in my experience.

> Absent this need (which doesn't exist in Python),


Python still takes time to load & "precompile". That time is becoming
significant for me even in a modest sized project; I imagine it would
be pretty awful in a multimillion line project.

No matter how fast it is, I'd rather reload one module than exit my
interpreter and reload the entire world.

This is not a problem for Python as scripting language. This is a real
problem for Python as world class application development language.

> In a package __init__, which exists expressly
> for the purpose of exposing its interior namespaces as a single flat
> one, it makes perfect sense.


OK! That's good info, thanks.

> Nobody I know uses reload() for anything more than trivial "as
> you work" testing in the interpreter. It's not reliable or recommended
> for anything other than that.


That too... although I think that's unfortunate. If reload() were
reliable, would you use it? Do you think it's inherently unreliable,
that is, it couldn't be fixed without fundamentally breaking the
Python language core?

> This is
> still a much faster way than compiling any but the most trivial of
> C/C++ modules.


I'm with you there! I love Python and I'd never go back to C/C++. That
doesn't change my opinion that Python's import mechanism is an
impediment to developing large projects in the language.

> If you don't like working with explicit namespaces, you've probably
> chosen the wrong language.


I never said that. I like foo.Bar(), I just don't like typing
foo.Foo() and bar.Bar(), which is a waste of space; syntax without
semantics.

> I propose that the technique most amenable to source code management
> is for a single file (or RCS level module, if you have a locking RCS)
> to have everything that it makes sense to edit or change for a
> specific feature.


Oh, I agree completely. I think we're using the exact same criterion.
A class is a self-contained feature with a well defined interface,
just what you'd want to put in its own file. (Obviously there are
trivial classes which don't implement features, and they don't need
their own files.)

> You're also placing far too much emphasis on reload. Focus yourself on
> unit tests and environment scripts instead. These are more reliable
> and easier to validate than reload() in a shell.


I think this is the crux of my frustration. I think reload() is
unreliable and hard to validate because Python's package management is
broken. I appreciate your suggestion of alternatives and I think I
need to come to terms with the fact that reload() is just broken. That
doesn't mean it has to be that way or that Python is blameless in this
problem.

Martin

 
Chris Mellon, 03-05-2007
On 5 Mar 2007 10:31:33 -0800, Martin Unsal <(E-Mail Removed)> wrote:
> On Mar 5, 9:15 am, "Chris Mellon" <(E-Mail Removed)> wrote:
> > That's actually the exact benefit of unit testing, but I don't feel
> > that you've actually made a case that this workflow is error prone.
> > You often have multiple developers working on the same parts of the
> > same module?

>
> Protecting your head is the exact benefit of bike helmets, that
> doesn't mean you should bike more more recklessly just because you're
> wearing a helmet.
>
> Doing text merges is more error prone than not doing them.
>
> There are myriad other benefits of breaking up large files into
> functional units. Integration history, refactoring, reuse, as I
> mentioned. Better clarity of design. Easier communication and
> coordination within a team. What's the down side? What's the advantage
> of big files with many functional units?
>



I never advocated big files with many functional units - just files
that are "just big enough". You'll know you've broken them down small
enough when you stop having to do text merges every time you commit.

> > If you don't do this, you aren't really testing your changes, you're
> > testing your reload() machinery.

>
> Only because reload() is hard in Python!
>
> > You seem to have a lot of views about
> > what the "Python way" should be and those are at odds with the actual
> > way people work with Python. I'm not (necessarily) saying you're
> > wrong, but you seem to be coming at this from a confrontational
> > standpoint.

>
> When I refer to "Pythonic" all I'm talking about is what I've read
> here and observed in other people's code. I'm here looking for more
> information about how other people work, to see if there are good
> solutions to the problems I see.
>
> However when I talk about what I think is "wrong" with the Pythonic
> way, obviously that's just my opinion formed by my own experience.
>
> > Your claim, for example, that the language shouldn't place constraints
> > on how you manage your modules is questionable. I think it's more
> > likely that you've developed a workflow based around the constraints
> > (and abilities) of other languages and you're now expecting Python to
> > conform to that instead of its own.

>
> I don't think so; I'm observing things that are common to several
> projects in several languages.
>


... languages with similar runtime semantics and perhaps common
ancestry? All languages place limitations on how you handle modules,
either because they have infrastructure you need to use or because
they lack it and you're left on your own.

> > I wonder if you've ever asked yourself why this is the case. I know
> > from my own experience why it's done in traditional C++/C environments
> > - it's because compiling is slow and breaking things into as many
> > files (with as few interdependencies) as possible speeds up the
> > compilation process.

>
> I don't think that's actually true. Fewer, bigger compilation units
> actually compile faster in C, at least in my experience.
>


If you're doing whole project compilation. When you're working,
though, you want to be able to do incremental compilation (all modern
compilers I know of support this) so you just recompile the files
you've changed (and dependencies) and relink. Support for this is why
we have stuff like precompiled headers, shadow headers like Qt uses,
and why C++ project management advocates single class-per-file
structures. Fewer dependencies between compilation units means a
faster rebuild-test turnaround.

> > Absent this need (which doesn't exist in Python),

>
> Python still takes time to load & "precompile". That time is becoming
> significant for me even in a modest sized project; I imagine it would
> be pretty awful in a multimillion line project.
>
> No matter how fast it is, I'd rather reload one module than exit my
> interpreter and reload the entire world.
>


Sure, but whats your goal here? If you're just testing something as
you work, then this works fine. If you're testing large changes, that
affect many modules, then you *need* to reload your world, because you
want to make sure that what you're testing is clean. I think this
might be related to your desire to have everything in lots of little
files. The more modules you load, the harder it is to track your
dependencies and make sure that the reload is correct.

> This is not a problem for Python as scripting language. This is a real
> problem for Python as world class application development language.
>


Considering that no other "world class application development
language" supports reload even as well as Python does, I'm not sure I
can agree here. A perfect reload might be a nice thing to have, but
lack of it hardly tosses Python (or any language) out of the running.

> > In a package __init__, which exists expressly
> > for the purpose of exposing its interior namespaces as a single flat
> > one, it makes perfect sense.

>
> OK! That's good info, thanks.
>
> > Nobody I know uses reload() for anything more than trivial "as
> > you work" testing in the interpreter. It's not reliable or recommended
> > for anything other than that.

>
> That too... although I think that's unfortunate. If reload() were
> reliable, would you use it? Do you think it's inherently unreliable,
> that is, it couldn't be fixed without fundamentally breaking the
> Python language core?
>


The semantics of exactly what reload should do are tricky. Python's
reload works in a sensible but limited way. More complicated reloads
are generally considered more trouble than they are worth. I've wanted
different things from reload() at different times, so I'm not even
sure what I would consider "reliable".

Here's a trivial example - if you rename a class in a module and then
reload it, what should happen to instances of the class you renamed?
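Even without a rename, the stale-instance problem shows up. A sketch (the "gadget" module is made up, and the file juggling is only there to make it runnable; reload was a builtin at the time of this thread, spelled importlib.reload in current Python):

```python
# After a reload, instances created before it still point at the *old*
# class object, so isinstance checks against the reloaded class fail.
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale .pyc during quick rewrites
root = tempfile.mkdtemp()
path = os.path.join(root, "gadget.py")

with open(path, "w") as f:
    f.write("class Gadget:\n    pass\n")
sys.path.insert(0, root)

import gadget
old = gadget.Gadget()

with open(path, "w") as f:  # "edit" the module...
    f.write("class Gadget:\n    shiny = True\n")
importlib.invalidate_caches()
importlib.reload(gadget)    # ...and reload it

new = gadget.Gadget()
stale = not isinstance(old, gadget.Gadget)  # True: old instance left behind
```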

> > This is
> > still a much faster way than compiling any but the most trivial of
> > C/C++ modules.

>
> I'm with you there! I love Python and I'd never go back to C/C++. That
> doesn't change my opinion that Python's import mechanism is an
> impediment to developing large projects in the language.
>
> > If you don't like working with explicit namespaces, you've probably
> > chosen the wrong language.

>
> I never said that. I like foo.Bar(), I just don't like typing
> foo.Foo() and bar.Bar(), which is a waste of space; syntax without
> semantics.
>


There's nothing that prevents there being a bar.Foo; the namespace
makes it clear where you're getting the object. This is again a
consequence of treating modules like classes. Some modules only expose
a single class (StringIO/cStringIO in the standard library is a good
example), but it's more common for them to expose a single set of
"functionality".

That said, nothing prevents you from using "from foo import Foo" if
Foo is all you need (or need most - you can combine this with import
foo).
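The two styles coexist happily; for instance (a stdlib module is used
here purely for illustration):

```python
import os.path            # keep the explicit namespace for occasional use
from os.path import join  # a bare name for the thing you use constantly

# Both names refer to the same function:
print(join("a", "b") == os.path.join("a", "b"))  # True
```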

> > I propose that the technique most amenable to source code management
> > is for a single file (or RCS level module, if you have a locking RCS)
> > to have everything that it makes sense to edit or change for a
> > specific feature.

>
> Oh, I agree completely. I think we're using the exact same criterion.
> A class is a self-contained feature with a well defined interface,
> just what you'd want to put in its own file. (Obviously there are
> trivial classes which don't implement features, and they don't need
> their own files.)
>


Sure, if all your classes are like that. But very few classes exist in
isolation - there are external and internal dependencies, and some
classes are tightly bound. There's no reason for these tightly bound
classes to be in separate files (or separate namespaces), because
when you work on one you'll need to work on them all.

> > You're also placing far too much emphasis on reload. Focus yourself on
> > unit tests and environment scripts instead. These are more reliable
> > and easier to validate than reload() in a shell.

>
> I think this is the crux of my frustration. I think reload() is
> unreliable and hard to validate because Python's package management is
> broken. I appreciate your suggestion of alternatives and I think I
> need to come to terms with the fact that reload() is just broken. That
> doesn't mean it has to be that way or that Python is blameless in this
> problem.
>


I wonder what environments you worked in before that actually had a
reliable and gotcha-free version of reload? I actually don't know of
any - Smalltalk is closest. It's not really "broken" when you
understand what it does. There's just an expectation that it does
something else, and when it doesn't meet that expectation it's assumed
to be broken. Now, that's a fair definition of "broken", but replacing
running instances in a live image is a very hard problem to solve
generally. Limiting reload() to straightforward, reliable behavior is
a reasonable design decision.
 
Dave Baum
      03-05-2007
In article <(E-Mail Removed) .com>,
"Martin Unsal" <(E-Mail Removed)> wrote:

> That too... although I think that's unfortunate. If reload() were
> reliable, would you use it? Do you think it's inherently unreliable,
> that is, it couldn't be fixed without fundamentally breaking the
> Python language core?


I wrote a module that wraps __import__ and tracks the dependencies of
imports. It then allows you to unload any modules whose source have
changed. That seemed to work out nicely for multi-module projects.

However, one problem I ran into was that dynamic libraries don't get
reloaded, so if you are doing hybrid C++/Python development then this
doesn't help - you still have to restart the whole python process to
pick up changes in your C++ code.

I also didn't do a ton of testing. It worked for a few small projects
I was working on, but I stopped using it once I ran into the dynamic
library thing, and at this point I'm used to just restarting python
each time. I'm sure there are some odd things that some python modules
could do that would interfere with the automatic reloading code I
wrote.
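The approach Dave describes can be sketched roughly like this (a
hypothetical reconstruction, not his actual module; in Python 2 the
hook lives in `__builtin__` rather than `builtins`):

```python
import builtins

_deps = {}  # importer module name -> set of module names it imported
_real_import = builtins.__import__

def tracking_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Record who asked for what, then delegate to the real machinery.
    importer = (globals or {}).get("__name__", "<unknown>")
    _deps.setdefault(importer, set()).add(name)
    return _real_import(name, globals, locals, fromlist, level)

builtins.__import__ = tracking_import
try:
    import json  # this import gets recorded in _deps
finally:
    builtins.__import__ = _real_import  # always restore the real hook

print(any("json" in mods for mods in _deps.values()))  # True
```

From the recorded dependency graph you can then decide which modules to
unload (e.g. delete from sys.modules) when their source files change.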

If you're interested in the code, drop me an email.

Dave
 
Bruno Desthuilliers
      03-05-2007
Martin Unsal wrote:
> On Mar 5, 12:45 am, "(E-Mail Removed)"
> <(E-Mail Removed)> wrote:
>
>>Remember that you can put code in
>>the __init__.py of a package, and that this code can import sub-
>>package/module namespaces, making the package's internal organisation
>>transparent to user code

>
>
> Sure, but that doesn't solve the problem.
>
> Say you have a package "widgets" with classes ScrollBar, Form, etc.
> You want the end user to "import widgets" and then invoke
> "widgets.ScrollBar()". As far as I know there are only two ways to do
> this, both seriously flawed: 1) Put all your code in one module
> widgets.py, 2) use "from scrollbar import *" in widgets/__init__.py,
> which is semi-deprecated


"deprecated"? Didn't see any mention of this so far. But it's bad form,
since it makes it hard to know where some symbol comes from.

# widgets/__init__.py
from scrollbar import Scrollbar, SomeOtherStuff, some_function, SOME_CONST

> and breaks reload().


>
>>Also remember that you can "import as", ie:
>>
>>import some_package.some_subpackage.some_module as some_module

>
>
> Sure, but that doesn't eliminate the unfortunate interaction between
> Python class organization and filesystem hierarchy.


*class* organization? It's not Java here. Nothing forces you to use
classes.

> For example, say
> you want to organize the widgets package as follows:
>
> widgets/scrollbar/*.py
> widgets/form/*.py
> widgets/common/util.py
>
> Other than messing around with PYTHONPATH, which is horrible, I don't
> see how to import util.py from the widget code.


Some of us still manage to do so without messing with PYTHONPATH.
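For the layout above, one workable answer is to put only the project
root on the path and use absolute (or, today, relative) imports inside
the package. A runnable sketch that builds that hypothetical layout in
a scratch directory:

```python
import os
import sys
import tempfile

# Recreate the widgets/ layout from the post in a temp directory.
root = tempfile.mkdtemp()
for pkg in ("widgets", "widgets/scrollbar", "widgets/common"):
    os.makedirs(os.path.join(root, pkg))
    open(os.path.join(root, pkg, "__init__.py"), "w").close()

with open(os.path.join(root, "widgets", "common", "util.py"), "w") as f:
    f.write("def helper():\n    return 'util'\n")
with open(os.path.join(root, "widgets", "scrollbar", "scrollbar.py"), "w") as f:
    # An absolute import from the package root; the relative form
    # "from ..common import util" works too.
    f.write("from widgets.common import util\n"
            "def make():\n    return util.helper()\n")

sys.path.insert(0, root)  # only the project root goes on the path
from widgets.scrollbar import scrollbar
print(scrollbar.make())   # util
```

Only the single project root is on sys.path; every module inside the
package reaches its siblings through the package namespace, with no
per-directory PYTHONPATH entries.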

>
>>Bad form IMHO. Packages and module names should be all_lower,
>>classnames CamelCased.

>
>
> You're still stuck doing foo.Foo() everywhere in your client code,


from foo import Foo

But:
> which is ugly


It's not ugly, it's informative. At least you know where Foo comes from.

> and wastes space,


My. Three letters and a dot...

> or using "from foo import *" which is
> broken.


cf above.

>
>>>but I
>>>believe this is categorically the wrong thing to do in large projects.

>>
>>Oh yes ? Why ?

>
>
> For myriad reasons, just one of them being the one I stated -- smaller
> files with one functional unit each


Oh. So you're proposing that each and every function goes in a
separate file?

> are more amenable to source code
> management with multiple developers.


This is not my experience.

> We could discuss this till we're blue in the face but it's beside the
> point. For any given project, architecture, and workflow, the
> developers are going to have a preference for how to organize the code
> structurally into files, directories, packages, etc. The language
> itself should not place constraints on them. The mere fact that it is
> supposedly "Pythonic" to put more functionality in one file indicates
> to me that the Python package system is obstructing some of its users
> who have perfectly good reasons to organize their code differently.


It has never been an issue for me so far.

>
>>>you're going to want files to contain the smallest
>>>practical functional blocks. I feel pretty confident saying that "put
>>>more stuff in one file" is the wrong answer, even if it is the
>>>Pythonic answer.

>>
>>Is this actually based on working experience ? It seems that there are
>>enough not-trivial Python projects around to prove that it works just
>>fine.

>
>
> Yes. I've worked extensively on several projects in several languages
> with multi-million lines of code


I meant, based on working experience *with Python*? I've still not seen
a multi-million-line project in Python - unless of course you include
all the stdlib and the interpreter itself, and even then I doubt we'd
get that far.

> and they invariably have coding
> styles that recommend one functional unit (such as a class), or at
> most a few closely related functional units per file.


Which is what I see in most Python packages I've seen so far. But
perhaps we don't have the same definition of "a few" and "closely
related"?

> In Python, most of the large projects I've looked at use "from foo
> import *" liberally.


I've seen few projects using this. And I wouldn't like having to
maintain such a project.

> I guess my question boils down to this. Is "from foo import *" really
> deprecated or not?


This syntax is only supposed to be a handy shortcut for quick testing
and exploration in an interactive session. Using it in production code
is considered bad form.
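Part of why it's bad form: what the star actually imports depends on
__all__ (or, absent that, on leading-underscore conventions), so the
reader of the importing file can't tell. A quick demonstration with a
throwaway in-memory module (all names hypothetical):

```python
import sys
import types

# Build a fake module "foo" in memory rather than on disk.
foo = types.ModuleType("foo")
exec(
    "__all__ = ['public']\n"
    "def public(): return 1\n"
    "def also_visible(): return 2\n",
    foo.__dict__,
)
sys.modules["foo"] = foo

ns = {}
exec("from foo import *", ns)
print("public" in ns)        # True
print("also_visible" in ns)  # False: __all__ controls the star-import
```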

> If everyone has to use "from foo import *"


I never did in 7 years.

> despite
> the problems it causes, how do they work around those problems (such
> as reloading)?


Do you often have a need for "reloading" in production code?

Martin, I'm not saying Python is perfect, but it really feels like
you're worrying about things that are not problems.
 