How to read files written with COBOL

 
 
Batista, Facundo
 
      05-10-2004
People:

I'm trying to convert my father from using COBOL to Python.

One difficulty we ran into is how to read, from Python, files written
with COBOL.

Do you know of a module that allows me to do that?

It would save us the work of writing a COBOL program that opens the COBOL
file and writes a CSV one (easily readable from Python).

Thank you all!

Facundo Batista
Desarrollo de Red
(E-Mail Removed)
(54 11) 5130-4643
Cel: 15 5132 0132



 
John Roth
 
      05-10-2004

"Batista, Facundo" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> People:
>
> I'm trying to convert my father from using COBOL to Python,
>
> One difficult thing we stuck into is how to read, from python, files written
> with COBOL.
>
> Do you know a module that allows me to do that?
>
> It should avoid us the work to write a COBOL program that open the COBOL
> file and write a CSV one (easily readable from python).


What's the OS for the two languages? COBOL from mainframe
to X86ish is very different from some flavor of Windows or Unix
COBOL.

Also, are we talking fixed or variable length records? And if
variable, how are they structured?

In either case, I think the struct module (under String Services)
is what you're looking for.
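
For the fixed-length case, a minimal sketch with the struct module might look like the following. The record layout (a 10-byte name, a 6-byte account id and a 4-byte big-endian binary integer) and the name `read_records` are made up for illustration, and the text fields are assumed to be ASCII; a mainframe file would more likely be EBCDIC and need a different codec.

```python
import struct

# Hypothetical 20-byte record layout (an assumption, not from any real file):
# 10-byte name, 6-byte account id, 4-byte big-endian signed binary integer.
RECORD = struct.Struct(">10s6si")


def read_records(stream):
    """Yield one (name, account, balance) tuple per fixed-length record."""
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break  # end of file; a short trailing chunk is ignored
        name, account, balance = RECORD.unpack(chunk)
        # Text fields are assumed to be ASCII and space-padded on the right.
        yield name.decode("ascii").rstrip(), account.decode("ascii"), balance
```

The stream must be opened in binary mode, for example `open(path, "rb")`.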

John Roth
>
> Thank you all!
>
> Facundo Batista
> Desarrollo de Red
> (E-Mail Removed)
> (54 11) 5130-4643
> Cel: 15 5132 0132
>
>
>



 
asdf sdf
 
      05-10-2004
Batista, Facundo wrote:
> People:
>
> I'm trying to convert my father from using COBOL to Python,
>
> One difficult thing we stuck into is how to read, from python, files written
> with COBOL.
>
> Do you know a module that allows me to do that?
>
> It should avoid us the work to write a COBOL program that open the COBOL
> file and write a CSV one (easily readable from python).
>
> Thank you all!
>
> Facundo Batista
> Desarrollo de Red
> (E-Mail Removed)
> (54 11) 5130-4643
> Cel: 15 5132 0132
>
>
>

i'm going to watch this thread with interest. a couple of weeks ago, i
asked about python to legacy mvs particularly for DB2 and Adabas access.
i got zero responses which suggested to me that no tools or modules
are in wide use.

i think you are undertaking a simpler problem generally. if all your
records are text it should be fairly straightforward. if not, you'll
need to figure out how to map COBOL data representations into python.

i seem to remember COMP-3, COMP-5 and packed decimal formats, among
others. what they mean, i don't know, but generally various floating
and fixed point formats.

you also need to handle REDEFINES which is used to produce a c-union
sort of arrangement, where multiple formats can be used to access the
same record.

88-Levels are a similar problem.

after Y2K, a lot of COBOL files contain some non-obvious date handling,
which could involve bit manipulation.

if you learn of any sorts of tools at all, please post them back here.
python screen scrapers, python compatible database drivers, anything at all.

interesting project idea: a COBOL to python _code_ converter. should
be feasible, in light of COBOL's very limited syntax.

ah, COBOL fun. all us old guys are reflecting on how glad we are we
left it behind.

it might be a good exercise for your dad, if he wants to retool himself,
and he already knows all the data format stuff.


 
John Roth
 
      05-10-2004

"asdf sdf" <(E-Mail Removed)> wrote in message
news:VYSnc.46990$(E-Mail Removed). com...
> Batista, Facundo wrote:
> > People:
> >
> > I'm trying to convert my father from using COBOL to Python,
> >
> > One difficult thing we stuck into is how to read, from python, files written
> > with COBOL.
> >
> > Do you know a module that allows me to do that?
> >
> > It should avoid us the work to write a COBOL program that open the COBOL
> > file and write a CSV one (easily readable from python).
> >
> > Thank you all!
> >
> > Facundo Batista
> > Desarrollo de Red
> > (E-Mail Removed)
> > (54 11) 5130-4643
> > Cel: 15 5132 0132
> >
> >
> >

> i'm going to watch this thread with interest. a couple of weeks ago, i
> asked about python to legacy mvs particularly for DB2 and Adabas access.
> i got zero responses which suggested to me that no tools or modules
> are in wide use.


I missed seeing it, somehow, but you're also right: I don't know
of any tools either.

> i think you are undertaking a simpler problem generally. if all your
> records are text it should be fairly straightforward. if not, you'll
> need to figure out how to map COBOL data representations into python.


In other words, take the 01s under the FD and create an object
that would expose all the converted data elements for the record?
Could be a somewhat interesting project, and it shouldn't be all
that hard since data descriptions are a fairly limited syntax.

> you also need to handle REDEFINES which is used to produce a c-union
> sort of arrangement, where multiple formats can be used to access the
> same record.


Redefines is implicit - it's just multiple level 01s under the same FD.

> 88-Levels are a similar problem.


Aren't an issue. 88s are basically an isXXX type function call. That's not
how they're implemented, but that's the basic semantics.
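
As a rough illustration of that reading, suppose a one-character status field carries two 88-level condition names, one for the value "A" and one for the values "C" or "X". In Python each condition name could become a read-only property. The class, the field and the values here are all invented for the example.

```python
class AccountRecord:
    """Hypothetical record with one status field and two 88-style conditions."""

    def __init__(self, status):
        self.status = status  # the underlying one-character field

    @property
    def is_active(self):
        # Plays the role of an 88 level with VALUE "A".
        return self.status == "A"

    @property
    def is_closed(self):
        # Plays the role of an 88 level with VALUE "C" "X".
        return self.status in ("C", "X")
```

Nothing extra is stored in the file for the conditions; they are only named tests on the field's value.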

> after Y2K, a lot of COBOL files contain some non-obvious date handling,
> which could involve bit manipulation.
>
> if you learn of any sorts of tools at all, please post them back here.
> python screen scrapers, python compatible database drivers, anything at all.
>
> interesting project idea: a COBOL to python _code_ converter. should
> be feasible, in light of COBOL's very limited syntax.
>
> ah, COBOL fun. all us old guys are reflecting on how glad we are we
> left it behind.


Ain't that the truth!

John Roth


 
Steve Williams
 
      05-11-2004
Batista, Facundo wrote:
> People:
>
> I'm trying to convert my father from using COBOL to Python,
>
> One difficult thing we stuck into is how to read, from python, files written
> with COBOL.
>
> Do you know a module that allows me to do that?
>
> It should avoid us the work to write a COBOL program that open the COBOL
> file and write a CSV one (easily readable from python).
>
> Thank you all!
>
> Facundo Batista
> Desarrollo de Red
> (E-Mail Removed)
> (54 11) 5130-4643
> Cel: 15 5132 0132
>
>
>

I wrote an ETL system in python for a client to convert from Microfocus
COBOL to DB2. Here are some of the problems I saw:

1) COBOL has a very rich set of datatypes defined by the PICTURE clause

character
unsigned integer
zoned signed integer
integer trailing sign separate
integer leading sign separate
packed signed decimal
packed unsigned decimal
floating point

with the usual COBOL zoo of implied decimal points and scaling

Not to mention COBOL allowing formatted numeric data to be
used as source fields in arithmetic operations.

In my application, each of these types was converted by a
parameter-driven function.

That is, I took the original COBOL 01 level definition and
converted it to a list with definition parameters name, type,
length, decimal point, etc. to make it easy for Python and
to add some stuff to make DB2 happy (convert to title case. . .)

I doubt if you can easily write a parser for the COBOL PICTURE
clause and for most cases it would be a waste of time. I just
converted the definition by using 'replacing all occurrences' in
a text processor.

I had the most problem with Microfocus unsigned decimal, as
I'd never seen it before.

2) Reading fixed and variable length records wasn't much of a problem

Reading Microfocus keyed sequential data with embedded indexes
took some bit-level coding.

3) None of this would be remotely attractive to a COBOL programmer.
Converting the data to CSV, however, might get his attention
as it's pretty easy in Python and not much fun in COBOL.

If you want to sell dad, talk about text and string processing
in Python.
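
The parameter-driven conversion described in point 1 could be sketched roughly as below. This is not the code from that project; the layout, the type names and the function names are all assumptions, only two of the listed types are handled, and the data is assumed to be ASCII.

```python
from decimal import Decimal

# Hypothetical layout, transcribed by hand from an 01-level definition:
# (name, type, length in bytes, implied decimal places)
LAYOUT = [
    ("cust_name", "char", 10, 0),   # e.g. PIC X(10)
    ("quantity", "uint", 5, 0),     # e.g. PIC 9(5)
    ("unit_price", "uint", 7, 2),   # e.g. PIC 9(5)V99
]


def convert_char(raw, decimals):
    # Character data: decode and drop the trailing space padding.
    return raw.decode("ascii").rstrip()


def convert_uint(raw, decimals):
    # Unsigned display digits; scaleb moves the implied decimal point.
    return Decimal(raw.decode("ascii")).scaleb(-decimals)


# One converter per field type, looked up by the type name in the layout.
CONVERTERS = {"char": convert_char, "uint": convert_uint}


def parse_record(record, layout=LAYOUT):
    """Slice one fixed-length record into a dict of converted fields."""
    result, offset = {}, 0
    for name, kind, length, decimals in layout:
        raw = record[offset:offset + length]
        result[name] = CONVERTERS[kind](raw, decimals)
        offset += length
    return result
```

Supporting another type then means adding one function and one entry in `CONVERTERS`, with no change to the loop.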

 
asdf sdf
 
      05-11-2004
Steve Williams wrote:

> I wrote an ETL system in python for a client to convert from Microfocus
> COBOL to DB2. Here are some of the problems I saw:
>
> 1) COBOL has a very rich set of datatypes defined by the PICTURE clause

<...snipping various items...>

> That is, I took the original COBOL 01 level definition and
> converted it to a list with definition parameters name, type,
> length, decimal point, etc. to make it easy for Python and
> to add some stuff to make DB2 happy (convert to title case. . .)

Steve,

I've been looking for ideas on getting at DB2 and Adabas from Python.
You might have some thoughts.

Is it feasible to go directly to MVS/DB2/Adabas from Python on Unix
or Win?

Is it more realistic to hit DB2 on AIX or Linux and use some kind of DB2
linking or replication to reach DB2/MVS?

Other ideas? Maybe 3270 emulation with screen scraping? How about
telnet 3270? (Hundreds of years ago, I could dial into a command line
MVS environment.)

I don't mean to hijack the thread. I think this is related and might be
helpful to unfortunates who have to interoperate with legacy systems.







 
Steve Williams
 
      05-12-2004
asdf sdf wrote:
> Steve Williams wrote:
>
>> I wrote an ETL system in python for a client to convert from
>> Microfocus COBOL to DB2. Here are some of the problems I saw:
>>
>> 1) COBOL has a very rich set of datatypes defined by the PICTURE clause

>
> <...snipping various items...>
>
>> That is, I took the original COBOL 01 level definition and
>> converted it to a list with definition parameters name, type,
>> length, decimal point, etc. to make it easy for Python and
>> to add some stuff to make DB2 happy (convert to title case. . .)

>
> Steve,
>
> I've been looking for ideas on getting at DB2 and Adabas from Python.
> You might have some thoughts.
>
> Is it feasible to go to directly to MVS/DB2/Adabas from Python on Unix
> or Win?
>
> Is it more realistic to hit DB2 on AIX or Linux and use some kind of DB2
> linking or replication to reach DB2/MVS?
>
> Other ideas? Maybe 3270 emulation with screen scraping? How about
> telnet 3270? (Hundreds years of ago, I could dial into a command line
> MVS environment.)
>
> I don't mean to hijack the thread. I think this is related and might be
> helpful to unfortunates to have to interoperate with legacy systems.
>
>
>
>
>
>
>

Well, the application processed a lot of data on a nightly basis. It
used FTP to connect to the COBOL machine (an AIX box) and FTP callbacks
to sequentially read the files and convert the data. There are two
bugs in the Python FTP module that surface if the file size is larger
than 2 gig, but they're easily fixed.
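
A minimal sketch of the FTP callback idea, written for current Python 3 and assuming fixed-length records: ftplib hands the callback blocks of arbitrary size, so the callback has to buffer them and cut out whole records. `RecordSplitter` and `fetch_records` are invented names, and the host, login and path are placeholders to be supplied by the caller.

```python
from ftplib import FTP


class RecordSplitter:
    """Buffer arbitrary-sized FTP blocks and pass on fixed-length records."""

    def __init__(self, record_length, handler):
        self.record_length = record_length
        self.handler = handler      # called once per complete record
        self.buffer = b""

    def __call__(self, block):
        self.buffer += block
        while len(self.buffer) >= self.record_length:
            record = self.buffer[:self.record_length]
            self.buffer = self.buffer[self.record_length:]
            self.handler(record)


def fetch_records(host, user, password, path, record_length, handler):
    """Stream a remote file in binary mode, record by record."""
    splitter = RecordSplitter(record_length, handler)
    with FTP(host) as ftp:
        ftp.login(user, password)
        # retrbinary calls splitter(block) for every block it receives.
        ftp.retrbinary("RETR " + path, splitter)
```

The file is never held in memory as a whole, which matters for nightly files of this size.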

I developed this application on Windows, initially targeting a test DB2
database on Windows and then moving the DB2 database to AIX and posting
with ODBC over the network from Windows.

In the full production environment I moved the Python
application to AIX. The moves were straightforward--Python was platform
independent for my purposes.

Initially I used ODBC or the API to post the data to DB2, but
that turned out to be slow. To get the speed I needed, I just wrote
the converted data to a CSV flat file and passed the file to the
DB2 loader utilities. No matter how good your code is, you'll never
outperform the database utilities.
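
Writing the converted rows out as CSV takes only a few lines with the csv module. This is a sketch under assumptions: the rows are dicts, the column order is given by the caller, and no header line is written (whether the loader wants one is not something stated here).

```python
import csv


def write_rows_as_csv(rows, fieldnames, stream):
    """Write dict rows as CSV columns in a fixed order, without a header."""
    writer = csv.writer(stream, lineterminator="\n")
    for row in rows:
        writer.writerow([row[name] for name in fieldnames])
```

When writing to a real file, open it with `open(path, "w", newline="")` so that the csv module controls the line endings.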

I've never used replication or linking. I know nothing about DB2 on
MVS. In general, my experience with DB2 on networks (admittedly Unix
and Windows boxes) tells me accessing DB2 on MVS over a network would
not be a problem. I know nothing about ADABAS.

Python will certainly do TELNET and screen scraping, but life is short.

Other than the overall success of the project (I've been told successful
data warehouse projects are rare) the major benefit of using Python was
the ability to try new concepts quickly. With python you have
enormous flexibility, as opposed to compiled languages (COBOL, C, etc)
or third party ETL utilities.

As an example, my application converted accounting data on
a nightly basis. With no advance warning, the Accounting department
converted to another package. The python code to extract and load
the data from the new system was written and in production in 2 days.

 
Buck Nuggets
 
      05-14-2004
Steve Williams <(E-Mail Removed)> wrote in message news:<nJhoc.186646$(E-Mail Removed) >...
> asdf sdf wrote:
> > Is it feasible to go to directly to MVS/DB2/Adabas from Python on Unix
> > or Win?


At least for DB2 this shouldn't be a problem - but would typically
involve a separate product - called "DB2 Connect". Shouldn't be cheap
or require any MVS components:
http://www-306.ibm.com/software/data/db2/db2connect/

> > Is it more realistic to hit DB2 on AIX or Linux and use some kind of DB2
> > linking or replication to reach DB2/MVS?


No, DB2 Connect should give you odbc, jdbc, cli, etc protocols
directly to mvs. You can go through another db2 database, but that's
probably extra work & complexity.

> Other than the overall success of the project (I've been told successful
> data warehouse projects are rare) the major benefit of using Python was
> the ability to try new concepts quickly. With python you have
> enormous flexibility, as opposed to compiled languages (COBOL, C, etc)
> or third party ETL utilities.


Nice case study. I've been building ETL systems for twelve years and
am on my second python etl project right now. Python has proved
itself the best option - there's nothing like adaptability when you've
got a dozen system interfaces to maintain! And its quick learning
curve has meant that bringing others up to speed has been a snap.

Most of my communication with db2 is just over the command line (via
popen2.Popen3) which is the only way to issue commands such as load,
export, force application, list application, etc. However, quite a
few of my summaries are run this way as well (typically mass inserts)
and aside from the primitive error codes, it works fine. There's also
at least one db2 python package (PyDB2). Here's a link to the
package:
http://sourceforge.net/projects/pydb2/
and here's a link to a tutorial for it:
https://www6.software.ibm.com/reg/de...91&S_CMP=DB2DD
I'm not using it yet, though a coworker just installed and started
using a python db2 module - I assume that it is this one.

And as far as reading files written in COBOL, here's a few thoughts:
1. don't make python read all the COBOL data types, instead make the
COBOL program write out a plain ascii record. Writing to a
fixed-length ascii record is very simple (if a little tedious to parse
on the other side).
2. if you can't modify the COBOL output...then you could consider a
commercial (perhaps with a free trial license) product that already
provides COBOL 'copybook' interpretation. There are quite a few of
these, though the least expensive ones I'm aware of are SyncSort, Data
Junction, and perhaps Compuware's FileAid. Don't think any have a
regular license for less than $1500.
3. if you have to read non-character cobol files, then I'd try to
just keep the number of options down to a reasonable number: you may
only need to support a few formats - such as zoned & packed decimal
(comp-3) for instance. Variable length files, float, comp-4, isam,
etc aren't that common. Redefines are often used in conjunction with
record types, and this can be sometimes simplified by just splitting
the file into multiple separate files by record type. And all the
formatting in the picture clause can be easily handled in the program
that reads the files (implied decimal places, signs, etc are all very
simple).
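
To give an idea of how little code point 3 needs, here is a sketch of decoders for packed decimal (comp-3) and zoned decimal. It assumes the usual packed layout (two digits per byte, sign in the last nibble, 0xD meaning negative) and, for zoned, EBCDIC-style bytes where the sign sits in the high nibble of the last byte. An ASCII compiler may encode zoned signs differently, so check against real data. The function names are made up and digit nibbles are not validated.

```python
from decimal import Decimal


def unpack_comp3(raw, decimals=0):
    """Decode packed decimal: two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()                 # last nibble holds the sign
    value = int("".join(str(n) for n in nibbles))
    if sign == 0x0D:                     # 0xD negative; 0xC or 0xF otherwise
        value = -value
    return Decimal(value).scaleb(-decimals)


def unpack_zoned_ebcdic(raw, decimals=0):
    """Decode EBCDIC zoned decimal: digit in each low nibble, sign zone last."""
    value = int("".join(str(b & 0x0F) for b in raw))
    if raw[-1] >> 4 == 0x0D:             # zone 0xD on the last byte: negative
        value = -value
    return Decimal(value).scaleb(-decimals)
```

The `decimals` argument carries the implied decimal places from the picture clause, so a field declared with two implied decimals would be decoded with `decimals=2`.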

buck
 