#5-2020-Feb-“DRY System for Writing Python Code”

This article is an attempt to formalize my strategy for writing DRY (Don't Repeat Yourself) Python code. Let me showcase the system first; the FAQ should then help you decide whether it fulfils your needs.

Why do you need a system in the first place?

  1. A fundamental goal when writing code is to stay at a high abstraction level (fewer lines of code -> fewer bugs). Therefore, you need a way of reusing your code.

  2. Running around your directory tree to find the code you were working on 3 months ago, because you need it now, is a pain.

  3. I am sloppy. I find shortcuts. I copy-paste. It works in the short term, but after a while my computer and my brain are a MESS. I need rules. I need a system.

Directory structure

workspace
├── delete_me # package for writing scratches, can be deleted anytime
├── external # directory for cloning external repositories
│   ├── kubernetes_stable # non-python external repository
│   └── faust 
│       ├── repo # actual cloned repo
│       └── python_env # environment needed to run code in repo
├── private_pypi # A private repository for storing your python src code
│   ├── src # In-use production code
│   |   └── anki # importable by: import anki
│   │       ├── anki_connect.py # importable by: from anki.anki_connect import connect
│   │       └── launch.py # usually an entry point used by `python launch.py`
│   ├── src_legacy
│   |   ├── legacy_package # Something not used anymore
│   │   │   └── legacy.py
│   |   └── legacy_package_2 # Something not used anymore
│   ├── tests
│   |   └── tests_anki
│   │       └── test_anki_connect.py
│   └── tests_legacy
│       └── tests_legacy_package
│           └── test_legacy.py
├── snapshots # directory for storing snapshots of code
│   └── anki_launch # entry point module path
│       ├── 2020_01_09_10_10 # snapshot made
│       │   ├── requirements.txt
│       │   ├── anki
│       │   │   ├── anki_connect.py
│       │   │   └── launch.py
│       │   └── utils
│       │       └── str_utils.py
│       └── 2020_02_09_12_43 # snapshot made
│           ├── anki
│           │   ├── anki_connect.py
│           │   └── launch.py
│           └── utils
│               └── str_utils.py
└── work # Usually a repository for work
    ├── src # In-use production code
    |   ├── utils # importable by: import utils
    │   │   └── str_utils.py # importable by: from utils.str_utils import to_snake_case
    |   ├── company_package_1
    |   └── company_package_2
    ├── src_legacy
    |   ├── legacy_package # Something not used anymore
    │   │   └── legacy.py
    |   └── legacy_package_2 # Something not used anymore
    ├── tests
    |   └── tests_utils
    │       └── test_str_utils.py
    └── tests_legacy
        └── tests_legacy_package
            └── test_legacy.py

FAQ

Where are the __init__.py files?

They are in every package (not shown above to reduce clutter).

Coding context: where should you write the code?

  • Writing a scratch, something small, never meant to be deployed -> /delete_me/your_module.py

  • Writing for work -> work/{src,tests}

  • Writing for a private project -> private_pypi/{src,tests}

  • Writing for an open source project -> external/{project_name}/repo

Note

Never write code in snapshots/*, work/src_legacy, or private_pypi/src_legacy.

What is on the PYTHONPATH?

During private development or writing a scratch:

  • private_pypi/src

  • private_pypi/tests

  • work/src

  • work/tests

During work development:

  • work/src

  • work/tests
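The two configurations above can be sketched as a small helper. This is a minimal sketch, not my actual setup: the function name `build_pythonpath`, the context labels "private" and "work", and the `~/workspace` location are all assumptions made for illustration.

```python
import os
from pathlib import Path

# Source roots per development context, relative to the workspace root.
# The context labels ("private", "work") are illustrative, not part of the system.
ROOTS = {
    "private": ["private_pypi/src", "private_pypi/tests", "work/src", "work/tests"],
    "work": ["work/src", "work/tests"],
}


def build_pythonpath(workspace, context):
    """Return a PYTHONPATH string for the given development context."""
    base = Path(workspace)
    return os.pathsep.join(str(base / rel) for rel in ROOTS[context])


if __name__ == "__main__":
    # Hypothetical workspace location; adjust to your own checkout.
    print(build_pythonpath(Path.home() / "workspace", "work"))
```

In a POSIX shell the result could be exported with something like `export PYTHONPATH="$(python build_pythonpath.py)"`, where the script name is again hypothetical.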

Why do you use src_legacy and tests_legacy?

  1. To make it easier to delete code. I struggle to delete code, and moving it into a legacy folder first makes that easier. If I need it later, I just move it back to src.
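Retiring a package can be scripted so that the source and its tests always move together. The following is a minimal sketch under the directory layout shown earlier; `retire_package` is a hypothetical helper name, and it assumes tests live in `tests/tests_<package>`.

```python
import shutil
from pathlib import Path


def retire_package(repo, package):
    """Move src/<package> to src_legacy/ and tests/tests_<package> to tests_legacy/."""
    repo = Path(repo)
    moves = [
        (repo / "src" / package, repo / "src_legacy" / package),
        (repo / "tests" / f"tests_{package}", repo / "tests_legacy" / f"tests_{package}"),
    ]
    moved = []
    for source, target in moves:
        if not source.exists():
            continue  # e.g. a package that never had tests
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved
```

Moving a package back is the same operation with the source and target directories swapped.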

What happens when there is a conflict (private_pypi/src/utils and work/src/utils)?

You are reusing code! There are a few options:

  1. Consider moving the utils into a new repository that lives by itself that follows the same structure as private_pypi or work

  2. Fuck it, keep both -> Warning: you are not DRY anymore. You will have syncing issues and a split-brain problem!

  3. Rename package to something more specific: work/src/dict_utils
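Such conflicts are easy to miss, because for regular packages Python silently imports from whichever root comes first on the path. A minimal sketch of a check for duplicated top-level package names follows; `find_conflicts` is a hypothetical helper, and it only considers directories that contain an `__init__.py`.

```python
from collections import defaultdict
from pathlib import Path


def find_conflicts(roots):
    """Map each top-level package name to the roots defining it, if more than one does."""
    owners = defaultdict(list)
    for root in roots:
        for entry in sorted(Path(root).iterdir()):
            # Treat a directory containing __init__.py as a package.
            if entry.is_dir() and (entry / "__init__.py").exists():
                owners[entry.name].append(str(root))
    return {name: found for name, found in owners.items() if len(found) > 1}
```

Calling it with the same roots that are on the PYTHONPATH, for example private_pypi/src and work/src, returns an empty dict when every package name is unique.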

My utils.str_utils depends on a constant specific to work/private_pypi

Introduce an environment variable!
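A minimal sketch of what that could look like, assuming the context-specific constant is a word separator. The variable name `STR_UTILS_DEFAULT_SEPARATOR`, its default, and the `join_words` function are all made up for illustration.

```python
import os

# Hypothetical variable name; choose one that fits your project.
ENV_VAR = "STR_UTILS_DEFAULT_SEPARATOR"


def default_separator():
    """Read the separator at call time so each context can override it."""
    return os.environ.get(ENV_VAR, "_")


def join_words(words):
    """Join words using the context-specific separator."""
    return default_separator().join(words)
```

The shared module stays identical in both contexts; only the environment differs.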

How do I deploy my code?

I use a script I have created (package_exporter.py) which takes the module path (anki.launch) and an output directory (/snapshots/anki_launch/2020_01_09_10_10) and copies all local dependencies into the output directory, together with a requirements.txt file listing the 3rd-party dependencies.

Side note: I also have a build/docker/anki_launch directory that is created/updated with a Dockerfile and docker-compose.py file

Side note 2: I also generate/update the files of a helm chart, deployment.yaml, chart.yaml and values.yaml
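The core idea of such an exporter can be sketched with the standard library. This is not package_exporter.py itself but a simplified illustration under stated assumptions: the function names are hypothetical, only absolute imports are followed, namespace packages are ignored, and the returned names are import names rather than pinned PyPI requirements.

```python
import ast
import shutil
import sys
from pathlib import Path


def resolve(module, roots):
    """Return (root, file) for a dotted module name found under one of the roots, else None."""
    rel = Path(*module.split("."))
    for root in map(Path, roots):
        for candidate in (root / rel.with_suffix(".py"), root / rel / "__init__.py"):
            if candidate.is_file():
                return root, candidate
    return None


def imported_names(source):
    """Yield dotted names referenced by absolute imports in the source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                yield alias.name
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            yield node.module
            # "from pkg import name" may refer to a submodule, so try that too.
            for alias in node.names:
                yield f"{node.module}.{alias.name}"


def export(module_path, roots, output_dir):
    """Copy module_path and its local dependencies; return non-local import names."""
    output_dir = Path(output_dir)
    pending, seen, unresolved, local_tops = [module_path], set(), set(), set()
    while pending:
        module = pending.pop()
        if module in seen:
            continue
        seen.add(module)
        found = resolve(module, roots)
        if found is None:
            unresolved.add(module.split(".")[0])
            continue
        root, file = found
        local_tops.add(module.split(".")[0])
        target = output_dir / file.relative_to(root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(file, target)
        if "." in module:
            pending.append(module.rsplit(".", 1)[0])  # also copy the parent package
        pending.extend(imported_names(file.read_text()))
    # sys.stdlib_module_names exists on Python 3.10+; older versions skip this filter.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(unresolved - local_tops - set(stdlib))
```

A timestamped output directory in the style used above can be built with `datetime.now().strftime("%Y_%m_%d_%H_%M")`.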

A word of caution

Warning

Depending on work packages from private_pypi is OK as long as the code is not business-specific (you don’t want to expose company secrets). The other direction is a problem, however: if you leave the company, its code cannot keep depending on your private code (unless you want them to be dependent on you).

Conclusions, Failed alternatives, and Future reading

Conclusions:

  • Staying DRY is hard

  • This system assumes you are only running one distribution of Python (3.8.1 in my case)

  • It also assumes that you try to keep a single repository each for your work and private code.

  • Modifying PYTHONPATH to fit your need is useful

  • Writing code is fun. But setting up dependencies and installing packages are not fun.

Failed alternatives:

  1. Copy pasting code

  2. Using an ea_ prefix for all my private packages. (I needed to sync back and forth, which I worked around with scripts, but I still had split-brain problems.)

  3. Running my own PyPI locally, packaging each shared package, and depending on it like a 3rd-party dependency. It works, but changes are more painful because each one requires a full publication and a version bump of your local packages.

Future reading:

  1. How do you detect dead code?

  2. How do you test your code?

  3. How do you document your code?

  4. How do you use environment variables in your code (aka how to use secrets safely and allow customization)?

  5. How do you deploy your code?

  6. How do you keep code-quality high?

  7. How do you backup your code?