2013-06-10

Programming Katas with Anki

Last year I was confronted with the fact that some of my technical skills had grown stale - once they were good enough to get the job done, all progress seemed to stop.   I hadn't stopped learning - just stopped learning about certain technologies and products.   There were benefits to being in this rut, like having extra time to spend on the needs of my organization and opportunities within my industry.   But all justifications aside, the bottom line was that my productivity was being whittled away through small and continuous inefficiencies in my development methods.   I was using vim like a glorified notepad, I was using the mouse rather than keyboard shortcuts in Firefox and Terminator, my use of Python was stuck in the 2.4 days, and I was consulting help on the find command far too often.

So, I settled on a strategy that I used many years ago - to learn one thing every day.   This mostly worked.   Where it failed was in the meaning of 'learn':  I was forgetting some things almost as quickly as I was learning them.   Certain items really required practice and repetition.

But why stop with just learning something well enough to do it slowly and painfully?   Ideally, I would know my tools so well that most common tasks don't require conscious thought at all.  This would eliminate unnecessary distractions and free me up to think about the harder problems at hand.   I want more of my development time to be in a mental state of flow.

2013-03-19

Gristle Slicer - of Architects, Chairs and Unix Utilities


There's an old story about two senior architects who were friends in college and met again thirty years later.   After a few minutes they started talking about their favorite achievements.   The first described office towers, airports, and universities he was quite proud of.   The second didn't have any monuments to talk about, but shared that he thought he may have designed the perfect chair.    Clearly trumped, his friend congratulated him and asked to hear more - since the perfect chair is far more significant than yet another monument.

Sometimes I feel that small unix utilities are to a programmer what a chair is to an architect:  they remain essential, are typically small and spare, do just one thing, and can clearly show elegance.

I've written quite a number of them, and have recently started packaging those related to data analysis into a project called DataGristle.   My favorite utility of the set is gristle_slicer - a tool similar to the Unix program cut.   While cut allows the user to select columns out of a file, gristle_slicer selects columns and rows - and uses the more functional Python string slicing syntax to do it.  
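I won't reproduce gristle_slicer's own options here, but the Python slicing syntax it borrows is easy to demonstrate - a minimal sketch with made-up data, applying slices to rows and columns the way the utility does to a file:

```python
# Python slice syntax is [start:stop:step]; negative indexes count from the end.
rows = [['a', 'b', 'c', 'd'],
        ['e', 'f', 'g', 'h'],
        ['i', 'j', 'k', 'l']]

last_two_rows  = rows[-2:]                   # select rows by slice
first_two_cols = [row[:2] for row in rows]   # select columns by slice
every_other    = rows[::2]                   # step through rows
```

The appeal over cut's syntax is that one small notation covers ranges, open ends, negative offsets, and steps.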

It's no perfect chair but it might be a good utility.  

2013-03-03

Installing Python with Pythonbrew

This is the third in a series about installing and managing multiple versions of Python on a linux host. The main article, which describes why this is necessary for development and testing or to upgrade back-level os-provided versions, is here.


Pythonbrew makes installation and management of multiple Python versions within the local user account easy.   It has a few issues:
  • I've found that it can fail to build some versions of Python.   Working around this involves giving it a more specific Python version name, or going back and making sure all the required host libraries are installed.
  • It requires at least Python 2.6 - so if you're starting with an older version (say, 2.4 on RHEL 5) you'll first need to install something more current using another method.  This leaves you with two different Python install methods, which is confusing.
Nonetheless, this is probably the best solution if you're running a modern python version for your OS, and can't install the versions you need via the Package Manager.

Installing Python with a package manager


This is the second part in a series about installing multiple versions of Python.   The main article, which describes why this is necessary for development and testing or to upgrade back-level os-provided versions, is here.

2013-02-08

The Hidden Value of Crappy Experience

A friend was recently sharing his experience in helping his company fix a botched RAID recovery.  He was exhausted by the work and questioning the value of what he was learning from his employer.

But in our discussion we agreed that bad experience might be as valuable as good experience in some ways.    It might not be as fun, and you can certainly hit diminishing returns on it.   But the intense memories of those bad experiences are the basis for many of our most valuable instincts and judgements.

Well, at the time, it felt kinda nice to be able to help him put a happy face on his obnoxious job.

And then just this week I had the tables turned on me:  A server I depended on failed because a battery swelled, and then warped and took out the motherboard.   Even better, the backups only appeared to be working, but really weren't.   And that's not all: there was some code on that server that wasn't yet in version control because of chaos on the team at the time - after existing in limbo for two years it was slated to be added this week.    We were down for days replacing that server.   And I suppose we were lucky - it could have been worse.

All this got me thinking about my glib advice to my friend about the value of crappy experience.   Was there a silver lining for me in all this?   Did this week add to my mental catalogue of situations to avoid?   Well, just as an exercise I decided to write down what the catalogue might look like.   At least just the part that deals with backups & recoveries.   Here it is:

2013-01-22

Programmers and Practice vs Training

Lately I've been trying to learn one new thing every day.   And one of the sad discoveries that I've made is that I'm capable of forgetting things almost as quickly as I learn them.

So, two months after I learned how to write vim macros - I've already forgotten the specific keys used to define and run them.   Now, I can re-learn this very quickly - I've got good notes, the memories are just slightly hidden, and I haven't forgotten any concepts, just simple keys.   But this will slow down my use enough that I probably won't pull this tool out when I need it.

This got me thinking about how I needed to complement training with some repetition, some practice.   That just learning something isn't good enough.   This is exactly what martial artists do - they would call these katas.    It's also what musicians do - they would call these scales.


2013-01-21

Working With Multiple Environments in Python


Working within a single Python environment - in which you're limited to one version of Python and one version of each of your modules - is fine for small projects, but as the number or age of your projects increases it leaves you boxed in.  You can't test your software against a new version of a dependency without putting everything else on hold, and then you may have to revert back.  Eventually, you'll end up doing insufficient testing or spending far too much time installing, uninstalling, and developing projects in series rather than in parallel.

Ian Bicking's Virtualenv changed this in 2007 with the ability to create multiple environments for a Python application.  Then a year later Doug Hellman wrote Virtualenvwrapper which added a set of convenience functions that significantly enhance virtualenv.

The documentation for these products is generally very good, but it seems to me that something is missing - documentation that pulls everything together across products so that the beginning user has a simple recipe for success.  The closest would be The Hitchhiker's Guide to Python, but this doesn't address multiple Python versions.   And I really want to nail down how to do this with multiple versions of Python.

The following information is intended to be a first step in that direction for the linux developer.


2013-01-20

Working With Multiple Versions of Python


I've needed to run multiple python versions for a while now, and have finally bitten the bullet and invested the time to get this working.    The primary drivers of this need are two separate projects:
  • At my office we're running Redhat Enterprise Linux 5.8 which uses Python 2.4.  We've got an alternate install of Python 2.6 available, but I'd like to move immediately to Python 2.7 and in six months to Python 3.3.   During the transition I may have some components running on one version and others running on another.   This will help me incrementally move over.
  • One of my side projects, DataGristle, was written on Python 2.6, but is being used in both Python 2.6 and Python 2.7 environments.   Testing it to ensure it works correctly on both previously required a second machine.   I'm getting tired of that and am looking forward to eventual Python 3.1, 3.2, and 3.3 compatibility.
But I've discovered that the documentation for this seems to only exist in bits and pieces distributed around the net.  The closest would be The Hitchhiker's Guide to Python, but this doesn't address multiple python versions and doesn't generally have command-by-command configuration instructions. The following information is intended to help put together a complementary, consolidated recipe to make this easier for a beginning Python developer on linux.

So, this is the first of two series about working with many environments.  The second series is about Working with Multiple Environments.

I've been updating these articles as updates (aka corrections) come in, and as experimentation with clean environments via VirtualBox slowly moves forward.  Thanks for the corrections and patience.


2012-10-10

Rediscovering the Benefits of Simple Design

Recently, I met a coworker from almost twenty years ago whose clearest memory of our time together was our discussions about design.   And how I got us all to make a field trip to the break room to take a look at the microwave oven there.

It had a dial and no buttons.  Pull the handle and it shut off the element automatically.   I loved the simplicity of this.   I loved how it made no demands of the user, and anybody could immediately put it to use.   There was no training, no documentation, no "insufficiently skilled users".

At the time we were rolling Microstrategy out to hundreds of users.    Microstrategy is a ROLAP (Relational On-Line Analytical Processing) tool: once you provided data in a relational database within a star schema and described it to Microstrategy as metadata, any user could use it easily - they could quickly create new reports by dragging and dropping element names, and it would generate the SQL for them.   It was a very powerful tool that in the right hands could achieve amazing results.   Prior to our roll-out of this tool the backlog on reports for our organization was ten months.   After we rolled it out I signed onto and delivered an 8-hour average SLA for the creation of new reports.


2012-09-20

In Praise of the Embarrassingly Simple

Recently, while having lunch with some former members of my project the conversation drifted to some of the old code that's still around.   These guys are incredibly good programmers, and so many of their contributions are still running today - four to six years after they've left the project.

One of the items that we discussed was our "batch broker" - a process responsible for handing out unique batch ids - that uniquely identify processes, end up in logs, in audit tables, and sometimes tagged to rows in the database.

We laughed about how embarrassingly simple this process was: just a few dozen lines of python code that
  • open up and lock a file
  • increment the number within
  • close & unlock the file
  • log the requester & new batch_id
  • return the batch_id
Our myriad batch programs (transforms, loads, publishes, etc) then simply call a bash or python function on their local system which calls this program remotely over ssh to get a new batch_id.   Total amount of code is maybe 50 lines across all libraries.
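The broker's code isn't published here, but the five steps above can be sketched in a few lines of Python. This is only an illustration of the idea - the function name and counter-file path are my own invention, and the real broker ran as a remote program invoked over ssh:

```python
import fcntl
import logging

logging.basicConfig(level=logging.INFO)

def next_batch_id(counter_path, requester):
    """Lock a counter file, increment the number within, log the request,
    and return the new batch_id.  (Hypothetical sketch of the batch broker.)"""
    with open(counter_path, 'r+') as f:
        fcntl.flock(f, fcntl.LOCK_EX)          # exclusive lock - one caller at a time
        try:
            batch_id = int(f.read().strip() or '0') + 1
            f.seek(0)
            f.truncate()
            f.write(str(batch_id))
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)      # unlock before the file is closed
    logging.info('issued batch_id %d to %s', batch_id, requester)
    return batch_id
```

The exclusive flock is what makes the ids unique even when many batch programs ask at once; everything else is bookkeeping.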