Projects in Python

Here are a number of literate programming projects focused on Python.

Stingray – Schema-Based File Reader

Data Cleansing, Domain Analysis and ETL Tool.

Stingray wraps a number of spreadsheet and flat-file reading components into a tidy package. It copes with COBOL files and even COBOL Copybook (or DDE) definitions so that simple Python programs can handle a wide variety of files.

The official project is on SourceForge: Stingray - Schema-Based File Reader.

Here is the woven literate programming documentation:


  • Wraps csv, xlrd, plus several XML parsers into a single, unified “workbook” structure so that applications can work with any of the common physical formats.
  • Extends the “workbook” to include fixed format files (with no delimiters) and even COBOL files in EBCDIC.
  • Provides a uniform way to load and use schema information. This can be header rows in the individual sheets of a workbook, or it can be separate schema information.
  • Provides a suite of data conversions that cover the most common cases.
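
The “workbook” idea in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Stingray's actual API: the class names `CSVSheet` and `FixedSheet`, the `load` helper, and the `(name, offset, size)` schema layout are all hypothetical. It shows one sheet taking its schema from a header row and another taking it from separate schema information, with the fixed-format data encoded in EBCDIC. Real COBOL files also involve packed decimals, OCCURS and REDEFINES clauses, which this sketch ignores.

```python
import csv
import io


class CSVSheet:
    """Hypothetical sheet over delimited text; the header row is the schema."""

    def __init__(self, source):
        self.source = source

    def rows(self):
        # DictReader takes the field names from the first row.
        yield from csv.DictReader(self.source)


class FixedSheet:
    """Hypothetical sheet over fixed-width records with an external schema."""

    def __init__(self, source, schema, record_length, encoding="ascii"):
        self.source = source
        self.schema = schema  # sequence of (name, offset, size)
        self.record_length = record_length
        self.encoding = encoding

    def rows(self):
        while True:
            record = self.source.read(self.record_length)
            if len(record) < self.record_length:
                break  # end of file (or a short trailing record)
            text = record.decode(self.encoding)
            yield {
                name: text[offset:offset + size].strip()
                for name, offset, size in self.schema
            }


def load(sheet):
    """The application sees the same row dictionaries from either sheet."""
    return list(sheet.rows())


csv_rows = load(CSVSheet(io.StringIO("name,qty\nwidget,3\n")))

# "cp037" is one of Python's standard EBCDIC codecs.
ebcdic_bytes = "widget  003".encode("cp037")
fixed_rows = load(
    FixedSheet(
        io.BytesIO(ebcdic_bytes),
        schema=[("name", 0, 8), ("qty", 8, 3)],
        record_length=11,
        encoding="cp037",
    )
)
```

Both `csv_rows` and `fixed_rows` are lists of dictionaries keyed by field name, so code downstream of `load` does not need to know which physical format it came from.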

PyVIX 2 – Controlling VMware from Python

This is a simple wrapper around VMware VIX that uses Python ctypes.

The official project is on SourceForge: PyVIX2.

This, too, is a literate programming exercise. See

Hierarchical Data Structures and Transitive Closures

My Transitive Closure example shows how to rapidly compute a transitive closure of a hierarchy and use that transitive closure to speed up SQL processing.

The closure reduces a hierarchy to a simple, flat list of ancestor-descendant relationships. A query can then find everything beneath a node with a single lookup, which speeds up SQL retrievals considerably and makes it possible to process arbitrary hierarchies.

This is one part of some more sophisticated Data Warehouse Extract, Transform and Load (ETL) tools.


You’ll need a SQL database; the Gadfly database is what this demo uses. At some point, I should update the demo to use the built-in SQLite database.
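
As a rough sketch of the idea (not the code from the demo), here is one way to flatten a hierarchy into a closure table and query it with the built-in `sqlite3` module. The function name `transitive_closure`, the `closure` table layout, and the sample hierarchy are assumptions made for illustration. The sketch also assumes the hierarchy is a tree with no cycles.

```python
import sqlite3


def transitive_closure(parent_of):
    """Return every (ancestor, descendant, depth) implied by child -> parent links.

    Assumes a tree: each node has at most one parent and there are no cycles.
    """
    closure = set()
    for child in parent_of:
        depth = 1
        ancestor = parent_of.get(child)
        while ancestor is not None:
            closure.add((ancestor, child, depth))
            ancestor = parent_of.get(ancestor)
            depth += 1
    return closure


# Hypothetical hierarchy, stored as child -> parent.
parent_of = {
    "root": None,
    "a": "root",
    "b": "root",
    "a1": "a",
    "a2": "a",
    "a1x": "a1",
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE closure (ancestor TEXT, descendant TEXT, depth INTEGER)"
)
conn.executemany(
    "INSERT INTO closure VALUES (?, ?, ?)",
    sorted(transitive_closure(parent_of)),
)

# One flat lookup replaces a recursive walk down the hierarchy.
descendants_of_a = [
    row[0]
    for row in conn.execute(
        "SELECT descendant FROM closure WHERE ancestor = ? ORDER BY descendant",
        ("a",),
    )
]
```

The trade-off is the usual one: the closure table costs extra rows and must be rebuilt when the hierarchy changes, in exchange for retrievals that need no recursion.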

Casino Games

Casino Games Simulator is a moderately complete simulator for Blackjack, Craps, Roulette and Caribbean Stud Poker.

Source: gamesim source. You can use pyWeb to create the documentation and source from the .w file.

This is an early version of the exercises in Building Skills in Object-Oriented Design.

Number Crunching

As part of any simulation effort, of course, you must check the quality of your random number generator. I wrote a complete empirical test of random number generators.

A report, Empirical Tests of Random Number Generators, has the overview of empirical testing along with the working Python source code.
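
To give a flavor of what an empirical test looks like, here is a minimal sketch of a chi-squared frequency test, one of the classic tests described by Knuth. It is not taken from the report's source; the function name `chi_squared_frequency`, the sample size, and the seed are assumptions chosen for illustration.

```python
import random


def chi_squared_frequency(samples, categories):
    """Chi-squared statistic for integer samples expected to be uniform
    over range(categories)."""
    expected = len(samples) / categories
    counts = [0] * categories
    for value in samples:
        counts[value] += 1
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Generator under test: Python's built-in Mersenne Twister, seeded so the
# run is repeatable.
rng = random.Random(42)
samples = [rng.randrange(10) for _ in range(10_000)]
statistic = chi_squared_frequency(samples, 10)

# With 10 categories there are 9 degrees of freedom; the 5% critical value
# for chi-squared with 9 degrees of freedom is about 16.92.
suspicious = statistic > 16.92
```

A single run proves little. A suspiciously small statistic is as much a warning sign as a large one, so a serious test repeats the experiment and looks at how the statistics themselves are distributed.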

This is built from three pyWeb modules. The Random Number Testing Source ZIP file has the source files. You can use pyWeb to create the documentation and source from the .w file.

I found that writing the modules the first time was relatively easy. After that, wrapping them in clear documentation and packaging them with pyWeb led me to make some improvements and to recognize other areas that still need improvement.

Literate Programming with pyWeb

Literate Programming is about developing source code and documentation side by side. pyWeb is a simple approach that works with almost any language and any kind of markup.

Literate programming was pioneered by Knuth as a method for developing readable, understandable presentations of programs. A program is presented in a literate fashion for people to read and understand. In parallel, it is presented as source text for a compiler to process. Both presentations are generated from a common source file.

One intent is to synchronize the program source with the documentation about that source. If the program and the documentation have a common origin, then the traditional gaps between intent (expressed in the documentation) and action (expressed in the working program) are significantly reduced.
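
The two presentations can be illustrated with a toy example. The sketch below is not pyWeb and does not use pyWeb's markup: the `[code]` and `[end]` markers and the functions `split_chunks`, `tangle` and `weave` are hypothetical, invented only to show one source file producing both a program and a document.

```python
def split_chunks(source):
    """Yield (kind, text) pairs, where kind is 'doc' or 'code'."""
    kind, lines = "doc", []
    for line in source.splitlines():
        marker = line.strip()
        if marker == "[code]" and kind == "doc":
            if lines:
                yield kind, "\n".join(lines)
            kind, lines = "code", []
        elif marker == "[end]" and kind == "code":
            yield kind, "\n".join(lines)
            kind, lines = "doc", []
        else:
            lines.append(line)
    if lines:
        yield kind, "\n".join(lines)


def tangle(source):
    """Extract only the code, in order, for the compiler or interpreter."""
    code = [text for kind, text in split_chunks(source) if kind == "code"]
    return "\n".join(code) + "\n"


def weave(source):
    """Produce the readable document, with code chunks indented."""
    parts = []
    for kind, text in split_chunks(source):
        if kind == "code":
            text = "\n".join("    " + line for line in text.splitlines())
        parts.append(text)
    return "\n".join(parts) + "\n"


SOURCE = """\
Compute the greeting once.
[code]
greeting = "hello"
[end]
Then print it.
[code]
print(greeting)
[end]
"""

program = tangle(SOURCE)
document = weave(SOURCE)
```

Because `program` and `document` both come from `SOURCE`, a change to a code chunk shows up in both, which is the synchronization described above. A real tool adds named chunks, cross-references and multiple output files.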

The complete “woven” document is pyWeb: In Python, Yet Another Literate Programming Tool.

The official project is on SourceForge: pyWeb Literate Programming Tool. The complete distribution includes unit tests.

About Python

The top-shelf object-oriented programming language is Python. It provides a relatively clean object-oriented environment, with the kind of direct support for OS features you expect from C (or Perl or Tcl). Most important, you can experiment and gracefully come to grips with good object-oriented design patterns.

While I know Tcl and Perl quite well, Python has advantages over both of these. First, Python is object-oriented. Of course, this can be bypassed through bad design. However, Tcl requires some serious work to be made object-oriented. Perl can be object-oriented, but it is difficult to get past the syntax. Perl suffers from an extreme case of TMTOWTDI (“There’s More Than One Way To Do It”), where there are not only several alternatives, but none that can be definitely called “the right way.”

Additionally, Python has advantages over Java and C++. The most notable advantage is the freedom from static bindings. Many people assert that compile-time static binding is the only thing that prevents total anarchy in the software engineering world. I think this is not completely true. In my experience, some programmers are stumped by static binding and work around it with simple-minded type casting where an overloaded function declaration would have been more appropriate. Also, the prevalent use of RTTI (Run Time Type Identification) indicates that not everyone can design static type bindings that cover the necessary cases.

date: Dec 20, 2017