Velocity Reviews - Computer Hardware Reviews

automatically generating file dependency information from python tools

 
 
Moosebumps
 
      04-09-2004
Say you have a group of 20 programmers, and they're all writing python
scripts that are simple data crunchers -- i.e. command line tools that read
from one or more files and output one or more files.

I want to set up some sort of system that would automatically generate
makefile type information from the source code of these tools. Can anyone
think of a good way of doing it? You could make everyone call a special
function that wraps the file() and detects whether they are opening the file
for read or write. If read, it's an input, if write, it's an output file
(assume there is no r/w access). Then I guess your special function would
output the info in some sort of repository, which collects such info from
all the individual data crunchers.
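
A minimal sketch of the wrapper idea, assuming Python 3 (where file() no longer exists and open() is the call to wrap). The name tracked_open, the DEP_LOG environment variable and the JSON-lines log format are all illustrative assumptions, not part of any existing tool:

```python
import json
import os
import sys


def tracked_open(path, mode="r", *args, **kwargs):
    """Open a file like open(), logging it as an input or an output."""
    # Assumption from the post: no mixed read/write access, so any mode
    # containing w, a, x or + is treated as an output.
    role = "output" if any(c in mode for c in "wax+") else "input"
    handle = open(path, mode, *args, **kwargs)
    record = {
        "tool": os.path.basename(sys.argv[0]),
        "role": role,
        "path": os.path.abspath(path),
    }
    # Hypothetical "repository": a JSON-lines file named by the DEP_LOG
    # environment variable, shared by all the data crunchers.
    log_path = os.environ.get("DEP_LOG", "deps.jsonl")
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")
    return handle
```

Each tool would call tracked_open() wherever it currently opens a file, and a separate step could turn the collected records into makefile-style rules.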

The other thing I could think of is statically analyzing the source code --
but what if the filenames are generated dynamically? I'd be interested in
any ideas or links on this, I just started thinking about it today. For
some reason it seems to be a sort of problem to solve with metaclasses --
but I haven't thought of exactly how.

thanks,
MB


 
 
 
 
 
Jack Diederich
 
      04-09-2004
On Fri, Apr 09, 2004 at 10:16:39PM +0000, Moosebumps wrote:
> Say you have a group of 20 programmers, and they're all writing python
> scripts that are simple data crunchers -- i.e. command line tools that read
> from one or more files and output one or more files.
>
> I want to set up some sort of system that would automatically generate
> makefile type information from the source code of these tools. Can anyone
> think of a good way of doing it? You could make everyone call a special
> function that wraps the file() and detects whether they are opening the file
> for read or write. If read, it's an input, if write, it's an output file
> (assume there is no r/w access). Then I guess your special function would
> output the info in some sort of repository, which collects such info from
> all the individual data crunchers.
>
> The other thing I could think of is statically analyzing the source code --
> but what if the filenames are generated dynamically? I'd be interested in
> any ideas or links on this, I just started thinking about it today. For
> some reason it seems to be a sort of problem to solve with metaclasses --
> but I haven't thought of exactly how.
>


In answer to the question you /almost/ asked:

http://www.google.com/search?q=python+make+replacement

 
 
 
 
 
John Roth
 
      04-09-2004

"Moosebumps" wrote:
> Say you have a group of 20 programmers, and they're all writing python
> scripts that are simple data crunchers -- i.e. command line tools that read
> from one or more files and output one or more files.
>
> I want to set up some sort of system that would automatically generate
> makefile type information from the source code of these tools. Can anyone
> think of a good way of doing it? You could make everyone call a special
> function that wraps the file() and detects whether they are opening the file
> for read or write. If read, it's an input, if write, it's an output file
> (assume there is no r/w access). Then I guess your special function would
> output the info in some sort of repository, which collects such info from
> all the individual data crunchers.
>
> The other thing I could think of is statically analyzing the source code --
> but what if the filenames are generated dynamically? I'd be interested in
> any ideas or links on this, I just started thinking about it today. For
> some reason it seems to be a sort of problem to solve with metaclasses --
> but I haven't thought of exactly how.


I'm not entirely clear on what the purpose of this is. I normally
think of "makefile" type information as something needed to compile
a program. This is something that isn't usually needed for Python
unless you're dealing with C extensions. Then I'd suggest looking at
SCons (www.scons.org).

What I'm getting is that you want to tie the individual programs
to the files that they're processing. In other words, build a catalog
of "if you have this kind of file, these are the available programs that
will process it."

So the basic question is: are the files coming in from the command
line or are they built in? If the latter, I'd probably start out by pulling
strings that have a "." or a "/" or a "\" in them, and examining the
context. Or look at calls to functions in the os.path module.
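
One hedged way to try the "pull out the strings" approach is Python's ast module, which finds string literals without running the tool. This is only a sketch: the function name path_like_strings is made up, and the "looks like a path" test is exactly the crude heuristic described above.

```python
import ast

# Characters that suggest a string literal might be a file name or path.
PATH_CHARS = (".", "/", "\\")


def path_like_strings(source):
    """Return sorted (line number, string) pairs for path-looking literals."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if any(ch in node.value for ch in PATH_CHARS):
                found.append((node.lineno, node.value))
    return sorted(found)
```

Docstrings and messages containing a "." will show up as well, so the results still need the "examine the context" step, and filenames built at runtime are missed entirely.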

More than likely you'll find a number of patterns that can be
processed and that will deal with the majority of programs. The
thing is, if you've got a bunch of programmers doing that kind
of work, they've probably fallen into habitual ways of coding
the repetitive stuff.

HTH

John Roth


 
Moosebumps
 
      04-10-2004
> I'm not entirely clear on what the purpose of this is. I normally
> think of "makefile" type information as something needed to compile
> a program. This is something that isn't usually needed for Python
> unless you're dealing with C extensions. Then I'd suggest looking at
> SCons (www.scons.org).


Well, sorry for being so abstract; let me be a little more concrete. I am
working at a video game company, and I have had some success using Python
for tools. I am just thinking about ways to convince other people to use
it. One way would be to improve the build processes, and be able to do
incremental builds of art assets without any additional effort from
programmers. Basically I'm trying to find a way to do some work for free
with Python.

The idea is that there are many different types of assets, e.g. 3D models,
textures/other images, animations, audio, spreadsheet data, etc. Each of
these generally has some tool that converts it from the source format to the
format that is stored in the game on disk / in memory. Hence they are
usually simple command line data crunchers. They take some files as input
and just produce other files as output.

Currently, we don't have time to generate the dependency information
necessary for incremental building, so we generally just build everything
over again from scratch, which takes 20 PCs the entire night. The problem
is that the pipeline changes frequently, and nothing is really documented,
especially the dependencies. It would be nice if there were a way to
automatically get these from the individual data crunchers, which may be
written by many different people. That would eliminate the redundancy of
having dependency information both in the source code of the individual
tools and in a separate file that specifies dependency info (like a makefile).

So instead of rebuilding the whole game, or having to know exactly which files
to rebuild (which some people know, but many others don't), the "make" tool
would be able to read the generated dependency information, check dates
on the source files to see what has changed, and build the minimum number of
things to get the game up to date. Currently lots of unnecessary things are
rebuilt constantly.
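
As a rough illustration of the date check, assuming the inputs and outputs of a tool run are already known as two lists of paths (the function name is_stale is made up for this sketch):

```python
import os


def is_stale(inputs, outputs):
    """Return True if any output is missing or older than the newest input."""
    if not all(os.path.exists(p) for p in outputs):
        return True
    # Compare modification times, the same rule make uses.
    newest_input = max((os.path.getmtime(p) for p in inputs), default=0.0)
    oldest_output = min((os.path.getmtime(p) for p in outputs),
                        default=float("inf"))
    return newest_input > oldest_output
```

A build driver would rerun a tool only when is_stale() returns True for its recorded inputs and outputs.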

> What I'm getting is that you want to tie the individual programs
> to the files that they're processing. In other words, build a catalog
> of "if you have this kind of file, these are the available programs that
> will process it."


Well, that is not exactly the point, but hopefully that information would
fall out of the automatic processing of the individual command line tools.

> So the basic question is: are the files coming in from the command
> line or are they built in? If the latter, I'd probably start out by pulling
> strings that have a "." or a "/" or a "\" in them, and examining the
> context. Or look at calls to modules from the os.path library.


They could be either "statically" specified in the source code, or only
known at runtime.

> More than likely you'll find a number of patterns that can be
> processed and that will deal with the majority of programs. The
> thing is, if you've got a bunch of programmers doing that kind
> of work, they've probably fallen into habitual ways of coding
> the repetitive stuff.


Yes, that is true, and everything works OK now, but there are thousands and
thousands of lines of redundant code, and the build process is very slow.
I'm just trying to separate out the common parts of every tool, rather than
having all that information duplicated in dozens of little command line
utilities.

MB


 
 
Moosebumps
 
      04-10-2004
>
> In answer to the question you /almost/ asked:
>
> http://www.google.com/search?q=python+make+replacement
>


That is definitely of interest to me, but I would want to go one step
further and automatically generate the dependency info. I haven't looked
specifically at these make replacements, but I would assume you have to use
a makefile or specify dependency info in some form like a text file. What I
am looking for is a way to automatically generate it from the source code of
the individual tools that the make program will run, or by running the tools
in some special mode where they just spit out which files they will
read/write.
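
A sketch of what such a "special mode" could look like for one tool. Everything here is hypothetical: the --list-deps flag, the plan() helper, and the rule that a source file maps to a ".tex" output are invented for illustration.

```python
import argparse
import json
import os


def plan(source):
    """Hypothetical cruncher plan: one source file in, one .tex file out."""
    return {"inputs": [source],
            "outputs": [os.path.splitext(source)[0] + ".tex"]}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Example asset cruncher")
    parser.add_argument("source")
    parser.add_argument("--list-deps", action="store_true",
                        help="print inputs/outputs as JSON and do no work")
    args = parser.parse_args(argv)
    deps = plan(args.source)
    if args.list_deps:
        # Special mode: report the files without touching any of them.
        print(json.dumps(deps))
        return deps
    # Placeholder conversion: copy the bytes through unchanged.
    with open(deps["inputs"][0], "rb") as src:
        data = src.read()
    with open(deps["outputs"][0], "wb") as dst:
        dst.write(data)
    return deps
```

Calling main(["textures/rock.png", "--list-deps"]) prints the plan as JSON. The catch is that both modes must share the same plan() logic, otherwise the reported dependencies drift away from what the tool really does.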

MB


 
 
Peter Hansen
 
      04-10-2004
Moosebumps wrote:

> Say you have a group of 20 programmers, and they're all writing python
> scripts that are simple data crunchers -- i.e. command line tools that read
> from one or more files and output one or more files.


Shall we read into this the implication that there is no
coding standard of any kind being used for these tools? So
no hope of saying something as simple as "use constants for
all filenames, using the following conventions..."?

> I want to set up some sort of system that would automatically generate
> makefile type information from the source code of these tools. Can anyone
> think of a good way of doing it? You could make everyone call a special
> function that wraps the file() and detects whether they are opening the file
> for read or write.


I think you've mixed up your two ideas in the above. You don't really
mean "source code" here, do you? You mean catching the information
dynamically from the running program, I think. That is something
that is probably quite easy to do with Python. For example, just
have everyone import a particular magic module that you create for
this purpose at the top of their scripts. That module installs a
replacement open() (or file()) function in the builtins module, and
then any file that is opened for reading or writing can be noticed
and relevant notes about it recorded in your repository.

> The other thing I could think of is statically analyzing the source code --
> but what if the filenames are generated dynamically?


As you've guessed, much harder to do. Especially with a language
that is not statically typed... (dare I say?)

-Peter
 
 
Steven Knight
 
      04-10-2004
> > I'm not entirely clear on what the purpose of this is. I normally
> > think of "makefile" type information as something needed to compile
> > a program. This is something that isn't usually needed for Python
> > unless you're dealing with C extensions. Then I'd suggest looking at
> > SCons (www.scons.org).
>
> Well sorry for being so abstract, let me be a little more concrete. I am
> working at a video game company, and I have had some success using Python
> for tools. I am just thinking about ways to convince other people to use
> it. One way would be to improve the build processes, and be able to do
> incremental builds of art assets without any additional effort from
> programmers. Basically I'm trying to find a way to do some work for free
> with python.
>
> The idea is that there are many different types of assets, e.g. 3D models,
> textures/other images, animations, audio, spreadsheet data, etc. Each of
> these generally has some tool that converts it from the source format to the
> format that is stored in the game on disk / in memory. Hence they are
> usually simple command line data crunchers. They take some files as input
> and just produce other files as output.


Check out SCons; it's specifically designed to be extensible in just
this way to handle different utilities for building different file types,
as well as allowing you to write scanners to return dependencies based on
any mechanism you can code up in Python. SCons is already in use by a
number of gaming companies to speed up and improve their builds.
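
To give a feel for what a scanner looks like, here is a hedged sketch of a scanner function in the shape the SCons documentation describes. The ".asset" file format and its "texture"/"mesh"/"anim" lines are invented for this example, and the registration calls in the trailing comment are recalled from the SCons user guide and should be checked against the version in use.

```python
import re

# Hypothetical asset description format: lines such as "texture rock.png".
REF_RE = re.compile(r"^(?:texture|mesh|anim)[ \t]+(\S+)$", re.M)


def asset_scan(node, env, path, arg=None):
    """Scanner function: return the files that the scanned node refers to."""
    names = REF_RE.findall(node.get_text_contents())
    return env.File(names)


# Inside an SConstruct, registration would look roughly like this:
#   env = Environment()
#   env.Append(SCANNERS=Scanner(function=asset_scan, skeys=[".asset"]))
```

With something like this in place, editing rock.png would mark every target built from an .asset file that mentions it as out of date.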

--SK
 
 
John Roth
 
      04-10-2004

"Moosebumps" wrote:
> >
> > In answer to the question you /almost/ asked:
> >
> > http://www.google.com/search?q=python+make+replacement
> >
>
> That is definitely of interest to me, but I would want to go one step
> further and automatically generate the dependency info. I haven't looked
> specifically at these make replacements, but I would assume you have to use
> a makefile or specify dependency info in some form like a text file. What I
> am looking for is a way to automatically generate it from the source code of
> the individual tools that the make program will run, or by running the tools
> in some special mode where they just spit out which files they will
> read/write.


SCons is what you want, then. It's got a scanner built in that can
be subclassed to scan anything to pull out dependency information
on the fly. Converting a build monstrosity to SCons isn't exactly
simple, but it's a lot simpler than any of the alternatives I can think
of.

John Roth