I’m a Systems/Software Engineer in the San Francisco Bay Area. I moved from Columbus, Ohio in 2007 after getting a B.S. in Physics from the Ohio State University. I'm married, and we have dogs.

Under my GitHub account (https://github.com/addumb): I open-sourced python-aliyun in 2014, I have an outdated Python 2 project starter template at python-example, and I have a pretty handy “sshec2” command and some others in tools.

I don't write

October 30, 2019

I don’t write, and I often get feedback that it’s hurting my career in software engineering.

Disclaimer: I’m butthurt about struggling to advance in my career.

The following is what happens when I try to write anything (this included):

  • I think I have a good idea
  • I flesh it out into fairly broad strokes
  • I start writing
  • I doubt my idea
  • I fear being bullied
  • I delete what I wrote

Why does this hurt my career? Because I’m a software engineer. We conflate visibility and productivity in this industry, thinking the engineer who writes Medium articles all the time must know their shit because they seem to have quite a following.

I’m not saying I shouldn’t have to broadly communicate. There’s obviously value in broadly communicating. It’s why the fucking printing press was a world-changing invention. I’m not talking about spreading the Good News here, though. I’m talking about rehashing architecture choices, workflow recommendations, production operations plans, and shit as banal as unit test coverage. Why would I bore the world or my peers with things that are trivial to search for? How could that possibly be valuable? How is that required for me to “level up”?

I’d like to explore why I don’t write, particularly for work. So let’s look at my steps of abandonment.

How to Tank Any Technical Career by “Not Communicating Enough”

I think I have a good idea

We all have good ideas. We’re humans. Each of us has a unique experience and can share ideas which others haven’t had. Stupid simple. I don’t have the confidence to call many of my ideas good, but I figure some probably are.

I flesh it out into fairly broad strokes

This is actually a pretty quick process: trim it down to the bare essentials, don’t demean anybody’s work, have a strong recommendation or request of the coworker who reads it. Easy peasy.

I start writing

For larger documents or plans, where more context and guiding is needed, I go off the rails pretty easily. I get bogged down in minutiae. I start trying to cover every angle of technical attack this thing may face.

I doubt my idea

As I’m writing a defense without an attack, I start to convince myself that it’s not worth writing. That it contradicts something else I hadn’t seen. That it’s obvious, not novel. That I’m missing some piece of context that makes the whole thing moot.

I fear being bullied

I can hear it now: “well you probably just can’t take feedback very well if you always think it’s bullying.” Sure, fine. There’s nothing I can say to that. It’s just shutting down a conversation. If that’s what you’re thinking, then this isn’t for you. Otherwise, you may have experienced this. The Super Sr. Hon. Principled Engineer XIV stepping into the idea you had and shitting all over it. Not because they have a better idea, but because they can shoot it down without bothering to consider it. Because one of the key words I used matched a thing they own. Because I recommended using a system they didn’t write. Because they can.

I delete what I wrote

Not this time!

That’s all; I mostly needed to vent. But in the spirit of not deleting what I wrote, here are snippets from earlier versions of what I thought I wanted to say…


I’m talking about advancing a career through sheer confidence. Faking it until you make it seems to be a requirement in software engineering career progression.

I do this personally, for nerd stuff or regular-ass personal stuff. I have a handful of partially drafted things I’ve written out. One weird thing about the personal ones is the audience. The audience is always me. I can’t write advice since I’ll just doubt the value of it. I mean hey, it didn’t help me the first time around, right? So I write to straighten my thoughts, to categorize and reinterpret.

I do this abandonment process at work almost every day. Let’s say I have an idea, or a direction shift, which I need a lot of people to get on-board with before the wheels come off of the product/app/website/project/whatever. So, I say it in a chat and a couple people agree, then I say it more broadly and get complete and utter silence. I could bring it up in a meeting, but I usually talk myself down from that by pointing out what happens every time I do: people say that’s a good idea, pretend it takes 10x the engineering time it does, and proceed to politely tell me to go fuck myself by saying that this next deadline is more important.

This has trained me not to share my ideas widely. It has taught me that my ideas are bad.

So I instead have 1:1 or small group conversations about technical direction, driving everybody to a higher bar and generally getting people excited to work on the most boring stuff possible: reliability. I do this a lot at work, pointing out subtle tactical changes and creating a new vision of whatever it is I’m working with.


I moved addumb.com into GitHub pages

March 07, 2016

It was kind of fun :) Believe it or not, I actually have a couple of other pages under addumb.com aside from shitty blog posts, and those were all really easy to move.

Moving my shitty blog from wordpress.com into GitHub pages was a little more complicated than all the guides make it seem. At a high level, you’ll do these three big steps:

  • Do the normal shit for GitHub Pages (make a new repo, set up a Jekyll site).
  • Import your Wordpress blog.
  • Fix all the stuff that just broke all over the place.

GitHub Pages

The GitHub docs on this are pretty straightforward. You should end up with a site with a directory structure like this:

$ tree -L 2
.
├── CNAME
├── Gemfile
├── README.md
├── _config.yml
├── _includes
│   ├── footer.html
│   ├── head.html
│   ├── header.html
│   └── sidebar.html
├── _layouts
│   └── default.html
├── derp.html
├── index.md
├── posts.md
├── sitemap.xml
└── vendor
    └── bundle

The contents of CNAME here are just “addumb.com” for me. Don’t do that for yours. Gemfile only contains two lines right now: “source ‘https://rubygems.org’” and “gem ‘github-pages’”. README.md is just a regular GitHub readme. _config.yml has some annoying pieces you may need to dig up, particularly this:

markdown: kramdown
kramdown:
  input: GFM
  force_wrap: false

Then the _includes are just your site template pieces, _layouts just has a default for now which is basically empty:

<!DOCTYPE html>
<html>
{% include head.html %}
<body>
  {% include header.html %}
  {{ content }}
  {% include footer.html %}
</body>
</html>

You should be able to run this to get your jekyll site running locally:

$ bundle exec jekyll serve

The Wordpress.com Import Stuff

Now that you have a fancy new GitHub Pages site up and running, it’s time to move your shitty blog over to it from Wordpress.com. The steps are very similar for other Wordpress setups, so try to read between the lines.

Warning: Make a new branch in your repo; you’re going to totally fuck shit up.

Okay, then follow along here…

  1. Log in to your wordpress.com account, then your “site”, then export it all as a single XML file.
  2. Add the jekyll-import dependency to your Gemfile where you should already have github-pages:

    source 'https://rubygems.org'
    gem 'jekyll-import'
    gem 'github-pages'
    

    Don’t try to add this at the outset. Install github-pages, and then jekyll-import. They have a fake version conflict.

  3. Now run this to get your import started:

    bundle exec jekyll import wordpressdotcom --source-file ~/Downloads/addumb.wordpress*.xml
    

    This will spew all kinds of unhelpful gem dependency failures for a while. One by one, you’ll need to fix them. This step is total bullshit.

    Whoops! Looks like you need to install 'hpricot' before you can use this importer.
    
    If you're using bundler:
      1. Add 'gem "hpricot"' to your Gemfile
      2. Run 'bundle install'
    
    If you're not using bundler:
      1. Run 'gem install hpricot'.
    

    This is pretty straightforward:

    echo "gem 'hpricot'" >> Gemfile
    bundle install
    bundle exec jekyll import wordpressdotcom --source-file ~/Downloads/addumb.wordpress*.xml
    

    and repeat until you don’t get a Gem error.

  4. Okay, now you have a steaming pile of malformatted “.html” files under _posts. Each one has “frontmatter” at the top of the page, set between --- delimiters. The frontmatter is just YAML describing some specifics about each post. Go through each one and clean up the leftover garbage; when you’re done it should look something like this post’s frontmatter:

     ---
     layout: post
     title: I moved addumb.com into GitHub pages
     date: 2016-03-07 11:26:00.000000000 -07:00
     type: post
     published: true
     status: publish
     description: Moving a blog from wordpress and website from AWS to GitHub Pages.
     keywords: github pages, aws migration, wordpress export
     categories:
     - Tech
     - Tip
     tags:
     - Tech
     author:
       login: addumb
       email: [email protected]
       display_name: addumb
       first_name: 'Adam'
       last_name: 'Gray'
     ---
    

    You should have removed garbage like this:

     meta:
     _publicize_pending: '1'
     _edit_last: '16162427'
     _wp_old_slug: '1'
     original_post_id: '1'
    
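Step 4 is tedious by hand if you have more than a few posts. Here’s a rough Python sketch of the cleanup; the garbage key list and the _posts path are just from this example, so adjust for whatever your importer actually left behind:

```python
import re
from pathlib import Path

# Keys the WordPress importer left behind in my case; adjust to taste.
GARBAGE_KEYS = {"meta", "_publicize_pending", "_edit_last", "_wp_old_slug",
                "original_post_id"}

def clean_frontmatter(text: str) -> str:
    """Drop garbage top-level keys (and their indented children) from the
    YAML frontmatter between the first pair of --- delimiters."""
    parts = text.split("---", 2)
    if len(parts) < 3:
        return text  # no frontmatter; leave the file alone
    skipping = False
    kept = []
    for line in parts[1].splitlines():
        match = re.match(r"^([A-Za-z_][\w-]*):", line)
        if match:  # a new top-level key starts; decide whether to skip it
            skipping = match.group(1) in GARBAGE_KEYS
        if not skipping:
            kept.append(line)
    return parts[0] + "---" + "\n".join(kept) + "\n---" + parts[2]

posts = Path("_posts")
if posts.is_dir():
    for post in posts.glob("*.html"):
        post.write_text(clean_frontmatter(post.read_text()))
```

Skim the diff afterward; a script like this can’t know which keys you actually care about.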

The Layout(s)

The import created a bunch of _posts which reference a layout called post. What is that and how do you create it and make it somewhat useful? (I cannot give any advice on making it truly useful.)

You might want to make the post layout so that your posts aren’t rendered plainly, like this. Here’s a starter, just plop this in _layouts/post.html:

<!DOCTYPE html>
<html>
  <head>
    <title>
    {% if page.title %} {{ page.title }}
    {% else %} {{ site.title }}
    {% endif %}
    </title>
  </head>
  <body>
    <h1>{{ page.title }}</h1>
    {{ content }}
  </body>
</html>

That will just make a web page with your post’s title in the title bar of the browser along with a big header up top, then your unadulterated goodness underneath.

Next, you may want to list your posts in a few different places. Here’s how I made the sidebar/bottombar list of posts here. I made a file in _includes named sidebar.html and it’s this:

Other posts...
<ul>
  {% for post in site.posts %}
  {% if page.url == post.url %}
  <li>&raquo; {{post.title}}</li>
  {% else %}
  <li><a href="{{post.url}}">{{post.title}}</a></li>
  {% endif %}
  {% endfor %}
</ul>

Then you can update your post.html layout to include this on the right-hand side however you’d like.
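A minimal version of that (no actual right-hand-side styling, just the include) would make the body of _layouts/post.html look something like:

```html
<body>
  <h1>{{ page.title }}</h1>
  {{ content }}
  {% include sidebar.html %}
</body>
```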

SEO

Ha! I have no idea what I’m talking about here, but generally do these things:

  1. Make a sitemap.xml.
  2. Add descriptions and keywords to your site and to each post. Check the head layout on this site: _includes/head.html
  3. Add some jQuery, that helps, right??
  4. Bootstrap something, is that still fashionable?
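For point 2, a bare-bones _includes/head.html might look like the following. I’m assuming the description and keywords come from each post’s frontmatter (like the example frontmatter earlier), falling back to site-wide values in _config.yml:

```html
<head>
  <title>{% if page.title %}{{ page.title }}{% else %}{{ site.title }}{% endif %}</title>
  <!-- per-post description/keywords from frontmatter, else site-wide defaults -->
  <meta name="description" content="{{ page.description | default: site.description }}">
  <meta name="keywords" content="{{ page.keywords | default: site.keywords }}">
</head>
```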

Walk Away

That’s it! That’s how I made this thing. It was fun :)


Quick Debian Backporting

March 10, 2014

Backporting

Suppose you're running Ubuntu and want to get a newer version of a package than what's provided by Ubuntu. The process of rebuilding a newer version of a package from a newer release of the distro is called "backporting." This can become very dangerous if you start backporting highly-dependent packages like python, so just don't do that. Try to keep to the leaves of your distro's dependency tree.

Example

As a specific example, I run Ubuntu 12.04 ("precise"), which comes with nose version 1.1.2-3, but I want something newer so that I can use the --cover-xml option! This option was introduced at https://github.com/nose-devs/nose/commit/868ce889f1b6cf6423fdd56fbc90058c2f4895d8 and first released in 1.2.

I want to backport nose >= 1.2 to Ubuntu Precise.

Do This

  • Go to http://packages.ubuntu.com/source/precise/nose and click through the different versions of Ubuntu on the top right until a version we want is there.
  • Saucy has nose 1.3.0, so I'll use that!
  • Copy the URL to the .dsc file under the "Download Nose" section: http://archive.ubuntu.com/ubuntu/pool/main/n/nose/nose_1.3.0-2.dsc
  • sudo apt-get install devscripts pbuilder
  • sudo pbuilder --create --distribution precise
  • Get some coffee, this will take a while.
  • dget http://archive.ubuntu.com/ubuntu/pool/main/n/nose/nose_1.3.0-2.dsc
  • sudo pbuilder --build --distribution precise nose_1.3.0-2.dsc

If all went well, you should now have some .deb files in /var/cache/pbuilder/result!

If you're like me and everything did NOT go well because you tried this in a VM with only 512MB RAM, you probably got some test failures like this:

======================================================================
FAIL: Doctest: test_issue270.rst
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.2/doctest.py", line 2153, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for test_issue270.rst
  File "/tmp/buildd/nose-1.3.0/build/tests/unit_tests/test_issue270.rst", line 0
----------------------------------------------------------------------
File "/tmp/buildd/nose-1.3.0/build/tests/unit_tests/test_issue270.rst", line 17, in test_issue270.rst
Failed example:
    run(argv=argv, plugins=[MultiProcess()])
Exception raised:
    Traceback (most recent call last):
      File "/usr/lib/python3.2/doctest.py", line 1288, in __run
        compileflags, 1), test.globs)
      File "", line 1, in
        run(argv=argv, plugins=[MultiProcess()])
      File "/tmp/buildd/nose-1.3.0/build/tests/nose/plugins/plugintest.py", line 412, in run_buffered
        run(*arg, **kw)
      File "/tmp/buildd/nose-1.3.0/build/tests/nose/plugins/plugintest.py", line 372, in run
        buffer = Buffer()
      File "/tmp/buildd/nose-1.3.0/build/tests/nose/plugins/plugintest.py", line 130, in __init__
        self.__queue = Manager().Queue()
      File "/usr/lib/python3.2/multiprocessing/__init__.py", line 98, in Manager
        m.start()
      File "/usr/lib/python3.2/multiprocessing/managers.py", line 527, in start
        self._process.start()
      File "/usr/lib/python3.2/multiprocessing/process.py", line 132, in start
        self._popen = Popen(self)
      File "/usr/lib/python3.2/multiprocessing/forking.py", line 121, in __init__
        self.pid = os.fork()
    OSError: [Errno 12] Cannot allocate memory
----------------------------------------------------------------------

Warning!

You should be very careful when installing them since they are unvalidated backports and may break things unexpectedly. Keeping to dependency-tree leaves is one way to mitigate this, since nothing in the distro itself depends on them. It is then up to you to make sure any software you use works nicely with the backported software. This is not limited to just .debs: it could be Python applications, production services, or even just some scripts you whipped up and forgot about until you need them to bring the site back up.


Considering different data systems?

November 15, 2013

Please feel free to agree or disagree. I'm likely downright wrong about a fair amount of this...

CAP

The first very high-level consideration is where would you like to sit on the CAP tradeoffs? Consistency across the entire dataset has some pretty stringent requirements. Availability of the dataset was traditionally thought of as a binary concept: is the DB up or down? But in these consumer internet days, it sometimes makes things a lot more relaxed if you allow yourself to have partial availability in your datasets, e.g. users hashing to ^0.* through ^4.* are down, or users hashing to 5 thru 9 are still OK. The partition tolerance part is where things get downright crazy... I'll try to get into some of those more awesomer and recent tradeoffs shortly and poorly.
  • CA: Highly consistent and available data has no (network) partition tolerance. Another way to say this is you have a single master which you fail over when it dies. This is a very familiar situation for traditional DBAs and people who can afford Oracle :)
  • AP: Highly available and partition-tolerant systems have high consistency costs. A typical setup here is offline processing: you have a place that accepts new writes, but nobody can read them until you copy out the data to all servers. That is strictly all servers, not “each server.”
  • CP: Highly consistent and partition-tolerant systems have low availability, per their trade-off. An example here is actually zookeeper :) If you tried to use zookeeper as a general-purpose datastore, you would notice very low “availability.” As in your write latencies can approach infinity, which is indistinguishable from the service being down.

People usually end up picking something in-between these three wonky states.

Sharding

Extremely broadly, "sharding" is the process of dividing your dataset into smaller pieces (shards), each potentially served by a different computer. It's a way to compromise some consistency for partition tolerance. You get the additional benefit of availabilities between "yes" and "no." The typical way to start sharding is to take an incoming piece of data from the system's input (a username, e.g.), run it through a hashing function, and designate ranges of the hashing function's output to be served by different shards.
One of the biggest understated assumptions of sharding is that the key you are hashing on is an equally-valued range. If, for example, your company is a ticket-sales company, you probably see each individual ticket sold as roughly as unimportant as the next. A terrible key to shard on would be the event the tickets are for. Why would the ticket be good and the event be bad? In a shard failure event (partition or availability gone), you will lose access to a specific percentage of your keys. If you keyed on event and the Superbowl happened to be one of those keys, you are going to lose a LOT of money. If you keyed on ticket, you will lose access to a specific percentage of tickets for all events, but you will retain access to many tickets for the Superbowl! The take-away here is to over-shard so that your unit of failure is small and to shard on data of equal business value.
So now that you know what you wish you could shard on, look back to the product and try to find what I call a naturally occurring key that's close to it. An example: you want to shard on user_id, which is an integer you create, but users keep entering their username! The fix is to hash on the username (which you receive in the POST) to make lookups O(1) instead of O(your user_id lookup stack) as the userbase grows.
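That hashing scheme fits in a few lines of Python; the shard count and key names here are made up for illustration:

```python
import hashlib

NUM_SHARDS = 64  # over-shard: lots of small shards keep each unit of failure small

def shard_for(key: str) -> int:
    """Map a naturally occurring key (a username, say) to a shard number.

    md5 is stable and evenly distributed; Python's built-in hash() is
    randomized per process, so it's no good for routing decisions."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

Every request carrying the same username routes to the same shard in O(1), no user_id lookup in the middle, and losing one shard costs you roughly 1/64th of your users rather than, say, the whole Superbowl.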


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 United States License. :wq