Jekyll2021-10-04T17:37:00+00:00https://jamesabley.com/feed.xmlHello, is this thing on?Intermittent cake-hole flappings of a nerdSoftware is drowning the world2021-01-09T13:50:33+00:002021-01-09T13:50:33+00:00https://jamesabley.com/software-is-drowning-the-world<p>One of the many upsides I’ve had from working at lots of organisations
is that you get to see what’s common. Are things like this everywhere?
Frequently, the answer is yes!</p>
<p>An example of this is tech debt.</p>
<p>I see organisations which are running to stand still, and I’m not
sure they realise they’re doing that.</p>
<p>What do I mean by this?</p>
<p>Every time you decide to solve a problem with code, you are committing
part of your future capacity to maintaining and operating that code.
Software is never done.</p>
<p>Here are a few examples demonstrating what I mean:</p>
<h2 id="security">Security</h2>
<ol>
<li>You write a networked service to solve a business problem. Say
it has an HTML web UI</li>
<li>It has no known security issues</li>
<li>Time passes</li>
<li>You now have security issues with your code, and you should assess
whether you need to do work to address these.</li>
</ol>
<p>WAT?</p>
<p>Humans are terri-bad at writing secure code. And given enough time,
other humans will discover the security holes in your service.</p>
<p>This applies both to code your organisation writes, and the libraries
they use, or the operating systems, or web servers, or …</p>
<h3 id="security-examples">Security Examples</h3>
<p>Take your pick from browsing a CVE database, or use
<a href="https://snyk.io/">Snyk</a> or similar to look at your current codebases.</p>
<ul>
<li>TLS / SSL issues:
<ul>
<li><a href="https://en.wikipedia.org/wiki/POODLE">POODLE</a></li>
<li><a href="https://blog.zoller.lu/2011/09/beast-summary-tls-cbc-countermeasures.html">BEAST</a></li>
<li><a href="https://en.wikipedia.org/wiki/CRIME">CRIME</a></li>
<li><a href="https://en.wikipedia.org/wiki/BREACH">BREACH</a></li>
<li><a href="https://en.wikipedia.org/wiki/Heartbleed">Heartbleed</a></li>
</ul>
</li>
<li><a href="https://www.cvedetails.com/vulnerability-list.php?vendor_id=15183&product_id=31286&version_id=&page=1&hasexp=0&opdos=0&opec=0&opov=0&opcsrf=0&opgpriv=0&opsqli=0&opxss=0&opdirt=0&opmemc=0&ophttprs=0&opbyp=0&opfileinc=0&opginf=0&cvssscoremin=6&cvssscoremax=0&year=0&month=0&cweid=0&order=1&trc=20&sha=97513f3fa07a803c5507b2cf550af9877acd90f2">Spring</a></li>
<li><a href="https://www.cvedetails.com/vulnerability-list.php?vendor_id=45&product_id=6117&version_id=&page=1&hasexp=0&opdos=0&opec=0&opov=0&opcsrf=0&opgpriv=0&opsqli=0&opxss=0&opdirt=0&opmemc=0&ophttprs=0&opbyp=0&opfileinc=0&opginf=0&cvssscoremin=6&cvssscoremax=0&year=0&month=0&cweid=0&order=1&trc=70&sha=5369e34293062ebe460c99e6878e0792ac23944c">Struts</a></li>
</ul>
<h3 id="legislation">Legislation</h3>
<ol>
<li>You write a networked service to solve a business problem. Say
it has an HTML web UI</li>
<li>It has no known legal compliance issues</li>
<li>Time passes</li>
<li>You now have legal compliance issues with your code, and you should
assess whether you need to do work to address these.</li>
</ol>
<p>WAT?</p>
<p>The <a href="https://en.wikipedia.org/wiki/General_Data_Protection_Regulation">General Data Protection Regulation</a>
addressed organisations not handling data very well.</p>
<p>Privacy and Electronic Communications Regulations – mostly known
for mandating cookie policy.</p>
<p>The Equality Act 2010 (UK) and the Americans with Disabilities Act
1990 (2010 update) for website accessibility. Yes, there was a time
when people didn’t consider accessibility when building websites.</p>
<p>Brexit has meant a lot of changes for businesses in the EU and UK.
Software has been rewritten to manage the new trading relationships.
This will continue to happen for a while as countries establish new
relationships.</p>
<h3 id="3rd-parties">3rd parties</h3>
<ol>
<li>You write a service to solve a business problem</li>
<li>You can build and release it when necessary</li>
<li>Time passes</li>
<li>You are now unable to build and release the service</li>
</ol>
<p>WAT?</p>
<p>3rd parties will change their APIs, or how things work. They may
do this for any number of reasons: performance, or security among
them. Older versions become deprecated, and unsupported. And these
older versions will still have new security issues reported against
them. So you need to upgrade, and adapt your code to use the new
API.</p>
<p>People building code libraries will strive to maintain backward
compatibility. But we still get <a href="https://semver.org/">semver</a> major
version changes, and breaking API changes.</p>
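<p>A minimal sketch of how a tool might tell a routine bump from one that
signals breaking changes, assuming plain <code class="language-plaintext highlighter-rouge">MAJOR.MINOR.PATCH</code> version
strings. The function names are mine, not from any particular tool, and
pre-release tags and the looser rules for <code class="language-plaintext highlighter-rouge">0.x</code> versions are ignored.</p>

```python
def parse(version):
    """Split a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def bump_kind(current, candidate):
    """Classify the upgrade from `current` to `candidate`."""
    old, new = parse(current), parse(candidate)
    if new <= old:
        return "none"   # not an upgrade at all
    if new[0] != old[0]:
        return "major"  # semver allows breaking API changes here
    if new[1] != old[1]:
        return "minor"  # new functionality, meant to be backward compatible
    return "patch"      # bug fixes only


def needs_human(current, candidate):
    """Hypothetical policy: only major bumps need someone to adapt code."""
    return bump_kind(current, candidate) == "major"


print(bump_kind("5.3.9", "5.3.10"))
print(needs_human("1.5.22", "2.0.0"))
```

<p>Comparing tuples of integers rather than strings matters here: as strings,
<code class="language-plaintext highlighter-rouge">5.3.10</code> would sort before <code class="language-plaintext highlighter-rouge">5.3.9</code>.</p>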
<h2 id="implications">Implications</h2>
<p>Most software needs constant maintenance. Building and operating
software has a cost which you should always factor in when deciding
to solve problems in that way.</p>
<p>A team working in a particular way can only be responsible for a
fixed amount of software. The amount of software should be managed,
otherwise the team will grind to a halt.</p>
<h2 id="proposition">Proposition</h2>
<blockquote>
<p>A team working in a particular way</p>
</blockquote>
<p>What if we change how they work?</p>
<p>Well yes, there are options there.</p>
<p>I’ve got a separate post (currently brewing) about Dunbar’s numbers,
but for this post, different sized organisations might have different
options. At a certain size, it makes sense to have people dedicated
to developer productivity and creating tools which improve the
capacity of other teams.</p>
<p>You can choose higher-level languages, and use technology stacks
from SaaS vendors which need less time from your people.</p>
<p>There is one option I had planned to spend time researching last year
(but I ended up getting a job instead). This feels like potentially
a big market. I’ve seen lots of organisations with decade-old
codebases which are still running unsupported versions of dependencies
or frameworks.</p>
<p>As a developer, I’m familiar with a hammer, and was curious if I
could use it.</p>
<p>Can we have tooling that automates keeping software up-to-date?</p>
<p>I’ve seen this problem in every organisation I’ve worked in, and
in every kind of software.</p>
<p>Web applications/APIs written in any language. As mentioned above,
there are many reasons that software rots if left unattended. Mobile
apps also have this. Migrating versions of Android, or iOS, or …</p>
<p>Configuration/manifests for Infrastructure as Code also suffer from
this. Terraform hasn’t yet released 1.x, but there have been many
changes over the years. If you’re using Cloud Foundry or Kubernetes,
you’ll <a href="https://github.com/doitintl/kube-no-trouble">have experienced
changes</a> which mean
you need to do work.</p>
<p>Automating the changes needed in YAML for upgrading from Kubernetes
<code class="language-plaintext highlighter-rouge">n</code> to <code class="language-plaintext highlighter-rouge">n+1</code> feels like a widely useful tool.</p>
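<p>As a rough sketch of what that might look like, assuming the manifest has
already been parsed from YAML into a dictionary. The mapping table is
illustrative and far from complete; it only covers Deployments, whose older
API groups were removed in Kubernetes 1.16 in favour of <code class="language-plaintext highlighter-rouge">apps/v1</code>.</p>

```python
# Illustrative mapping of (old apiVersion, kind) to the replacement apiVersion.
# A real tool would need the full list for each Kubernetes release.
REPLACEMENTS = {
    ("extensions/v1beta1", "Deployment"): "apps/v1",
    ("apps/v1beta2", "Deployment"): "apps/v1",
}


def upgrade_manifest(manifest):
    """Return a copy of the manifest with a removed apiVersion replaced."""
    key = (manifest.get("apiVersion"), manifest.get("kind"))
    upgraded = dict(manifest)  # shallow copy, the input is left untouched
    if key in REPLACEMENTS:
        upgraded["apiVersion"] = REPLACEMENTS[key]
    return upgraded


old = {
    "apiVersion": "extensions/v1beta1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
}
print(upgrade_manifest(old)["apiVersion"])
```

<p>Changing the version string is the easy part. For example,
<code class="language-plaintext highlighter-rouge">apps/v1</code> Deployments require an explicit
<code class="language-plaintext highlighter-rouge">spec.selector</code>, so a useful tool would have to transform
the body of the manifest as well.</p>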
<h2 id="current-state">Current State</h2>
<p>There are some commercial things which do related work.</p>
<p>Snyk, <a href="https://github.com/renovatebot/renovate">Renovate</a>,
<a href="https://dependabot.com/">Dependabot</a> and other things exist which
can make pull requests to update dependencies. Most languages have
a package tool and bumping numbers is pretty straightforward. These
things tend to not be able to manage breaking API changes though.
Bumping a patch or minor dependency upgrade is fine, but a major
one with breaking API changes tends to need a human to get involved.</p>
<p>Why? Could we have a tool that solves this? When a new version of
<a href="https://spring.io/">Spring</a> is released, could it include an
accompanying set of transformations which will allow the entire
ecosystem to safely and rapidly upgrade?</p>
<p>Having a minor interest in compilers (and having worked on a
commercial interpreter), I tend to think of code editing operations
as transformations, rather than characters. There’s been <a href="https://www.facebook.com/notes/kent-beck/prune-a-code-editor-that-is-not-a-text-editor/1012061842160013">some
research in transformation-based
editors</a>,
but I’ve not seen a lot else.</p>
<p>Major version upgrades could potentially be similarly expressed in
terms of transformations, which similarly might be composed. So if
a class has been removed between major versions of a dependency,
the required transformation might be composed of:</p>
<ol>
<li>Insert new class <code class="language-plaintext highlighter-rouge">my.Y</code></li>
<li>Implement interface <code class="language-plaintext highlighter-rouge">spring.new.Z</code></li>
<li>Adapt method <code class="language-plaintext highlighter-rouge">A</code> from old class <code class="language-plaintext highlighter-rouge">my.Z</code> onto method <code class="language-plaintext highlighter-rouge">B</code> in new
class <code class="language-plaintext highlighter-rouge">my.Y</code></li>
<li>Adapt parameters from adapted method – a <code class="language-plaintext highlighter-rouge">Context</code>
used to be obtained from <code class="language-plaintext highlighter-rouge">ApplicationSingleton</code> but is now passed
in explicitly</li>
<li>etc</li>
</ol>
<p>And then you would need a serialisation format and publishing
mechanism for these sets of transformations.</p>
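<p>To make the composition idea concrete, here is a sketch of how such a set of
transformations might be modelled and serialised. Everything in it is
hypothetical: the <code class="language-plaintext highlighter-rouge">Step</code> and <code class="language-plaintext highlighter-rouge">Transformation</code> types,
the action names, and the choice of JSON are assumptions for illustration,
reusing the made-up class names from the list above.</p>

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Step:
    action: str       # hypothetical action name, e.g. "insert_class"
    target: str       # the class or method the step applies to
    detail: str = ""  # free-form extra information for the step


@dataclass
class Transformation:
    name: str
    steps: List[Step] = field(default_factory=list)

    def then(self, other):
        """Compose two transformations into one larger transformation."""
        return Transformation(f"{self.name} + {other.name}", self.steps + other.steps)

    def to_json(self):
        """Serialise to JSON so the set could be published alongside a release."""
        return json.dumps(asdict(self), indent=2)


replace_class = Transformation("replace my.Z", [
    Step("insert_class", "my.Y"),
    Step("implement_interface", "my.Y", "spring.new.Z"),
    Step("adapt_method", "my.Y.B", "was my.Z.A"),
])
pass_context = Transformation("pass Context explicitly", [
    Step("adapt_parameter", "my.Y.B", "Context came from ApplicationSingleton"),
])

upgrade = replace_class.then(pass_context)
print(upgrade.to_json())
```

<p>This only describes the transformations. Applying them to real code is the
hard part.</p>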
<p>The closest I’ve seen to this is where Google actually did that in
the same target language. They <a href="https://blog.golang.org/introducing-gofix">published a tool for the
Go language</a>, named
<a href="https://golang.org/cmd/fix/"><code class="language-plaintext highlighter-rouge">fix</code></a>. It automated upgrades of
existing code before 1.x was released, and since then, they’ve had
<a href="https://golang.org/doc/go1compat">the Go 1 compatibility document</a>.</p>
<p>Sadly, <code class="language-plaintext highlighter-rouge">fix</code> appears to have been mostly inactive since then?</p>
<p>I’m interested (academically as well as commercially) in producing
a tool which looks similar, but is much more widely applicable.</p>
<p>So having something that can take code/configuration, generate an
Abstract Syntax Tree (AST), and then apply a set of transformations.
Transformations compose. A large one might be <strong>Upgrade Framework
<code class="language-plaintext highlighter-rouge">n</code> to <code class="language-plaintext highlighter-rouge">n+1</code></strong>, involving lots of smaller transformations. For each
transformation, you’d need to query the AST for usages of the old
API, then try to apply the transformation which maps the old API
to the new API.</p>
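<p>Here is a small sketch of that query-then-transform loop, using Python’s
standard <code class="language-plaintext highlighter-rouge">ast</code> module (and <code class="language-plaintext highlighter-rouge">ast.unparse</code>, so Python 3.9 or
later). The library names <code class="language-plaintext highlighter-rouge">oldlib</code> and <code class="language-plaintext highlighter-rouge">newlib</code> are
invented, and a renamed function is about the simplest transformation there is.</p>

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to old_module.old_name(...) as new_module.new_name(...)."""

    def __init__(self, old, new):
        self.old = old    # (module, function) of the old API
        self.new = new    # (module, function) of the new API
        self.matches = 0  # how many usages of the old API were found

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested calls first
        func = node.func
        if (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and (func.value.id, func.attr) == self.old):
            self.matches += 1
            func.value.id, func.attr = self.new
        return node


def upgrade(source, old, new):
    """Parse source, rewrite old API calls, return (new source, match count)."""
    tree = ast.parse(source)
    transformer = RenameCall(old, new)
    tree = transformer.visit(tree)
    return ast.unparse(tree), transformer.matches


code, count = upgrade(
    "result = oldlib.fetch(url, timeout=5)",
    old=("oldlib", "fetch"),
    new=("newlib", "get"),
)
print(code, count)
```

<p>Even this toy shows some of the difficulty: it does not touch the
<code class="language-plaintext highlighter-rouge">import</code> statements, it misses aliased imports, and
<code class="language-plaintext highlighter-rouge">ast.unparse</code> throws away comments and formatting, which most
teams would not accept in a real upgrade.</p>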
<p>I’ve found <a href="https://dl.acm.org/doi/10.1145/1103845.1094832">one related paper</a>.
Given that it’s not gone further, was it too hard, or not viable, or
the wrong time?</p>
<h2 id="summary">Summary</h2>
<p>So I think this would be the next evolution in automated upgrades.
It seems like a big market – how many companies would pay for you
to solve this problem for them and allow them to concentrate on
business logic rather than plumbing concerns?</p>
<p>But I didn’t take the time off I planned to confirm the potential
market and see how hard a problem it would be to solve :)</p>
<h2 id="further-reading">Further Reading</h2>
<p><strong>Update</strong> I originally published this without a list of references.
I should have done the hard work to include them. Mostly that meant
mining my browser history and Pinboard from February and March 2019
when I spent a chunk of time first looking at this
<strong>for absolutely no reason at all, clients of the time</strong>.</p>
<ul>
<li><a href="https://dl.acm.org/doi/10.1145/1103845.1094832">Refactoring support for class library migration</a></li>
<li><a href="https://ercim-news.ercim.eu/en88/special/automatic-upgrade-of-java-libraries">Automatic Upgrade of Java Libraries</a>
which linked to <a href="http://web.archive.org/web/20170409031849/http://kenai.com/projects/refactoringng">a defunct Netbeans plugin</a></li>
<li><a href="http://autorefactor.org/">Autorefactor</a></li>
<li><a href="http://walkmod.com/">Walkmod</a></li>
<li><a href="https://github.com/Netflix-Skunkworks/rewrite">Rewrite</a>
<ul>
<li>This has now become <a href="https://docs.openrewrite.org/">OpenRewrite</a>
and might be what I’m after, for Java and YAML at least. There is
an example which (when complete) is supposed to migrate from
Spring Boot 1.5.x to Spring Boot 2.x.</li>
</ul>
</li>
</ul>James AbleyOne of the many upsides I’ve had from working at lots of organisations is that you get to see what’s common. Are things like this everywhere? Frequently, the answer is yes!Organisation Cultures2018-04-14T20:29:31+00:002018-04-14T20:29:31+00:00https://jamesabley.com/organisation-cultures<p><em>If you’ve been here before, you might want to <a href="#the-stories">skip to the stories</a></em>.</p>
<p class="look-at-me"><span><q>All models are wrong but some are useful</q> – <a href="https://en.wikipedia.org/wiki/All_models_are_wrong#Quotations_of_George_Box">George Box</a></span></p>
<h1 id="context">Context</h1>
<p>The model in this case is <a href="http://qualitysafety.bmj.com/content/13/suppl_2/ii22">Professor Ron Westrum’s work on categorising organisational cultures</a>.
That’s quite readable for an academic paper. You might also
like <a href="https://continuousdelivery.com/implementing/culture/">the tl;dr version on Continuous Delivery</a>.</p>
<p>The abstract to the paper states:</p>
<blockquote>
<p>There is wide belief that organisational culture shapes many aspects of
performance, including safety. Yet proof of this relationship in a medical
context is hard to find. In contrast to human factors, whose contributions
are many and notable, culture’s impact remains a commonsense, rather than a
scientific, concept. The objectives of this paper are to show that
organisational culture bears a predictive relationship with safety and that
particular kinds of organisational culture improve safety, and to develop a
typology predictive of safety performance. Because information flow is both
influential and also indicative of other aspects of culture, it can be used
to predict how organisations or parts of them will behave when signs of
trouble arise. From case studies and some systematic research it appears that
information culture is indeed associated with error reporting and with
performance, including safety. Yet this relationship between culture and
safety requires more exploration before the connection can be considered
definitive</p>
</blockquote>
<p>Westrum’s research was focused on patient safety in medical units. This was
viewed as a metric of good performance.</p>
<p>He defined 3 patterns:</p>
<ol>
<li>Pathological – a preoccupation with personal power, needs, and glory</li>
<li>Bureaucratic – a preoccupation with rules, positions, and departmental turf</li>
<li>Generative – a concentration on the mission itself, as opposed to a concentration on persons or positions</li>
</ol>
<p>There are some interesting parallels between the medical context, and
commercial and public sector organisations.</p>
<h1 id="why-this-is-relevant">Why this is relevant</h1>
<blockquote>
<p>From case studies and some systematic research it appears that information
culture is indeed associated with error reporting and with performance</p>
</blockquote>
<p>That seems relevant to knowledge workers.</p>
<p>I’ve spent the last 6 years doing
<a href="https://definitionofdigital.com/">digital transformation</a>. That’s both in the public
and private sector, in the UK.</p>
<p>I’ve experienced all 3 types of patterns. I have a definite preference for
working in a generative way.</p>
<p>First, because it’s kinder to people. It’s more fun.</p>
<p>The other 2 can be exhausting and frustrating. That tends to lead to burnout,
and retention problems. That’s an expensive problem for an organisation to have.</p>
<p>In my experience, working in a generative culture leads to better outcomes. Do
you know of an organisation that wouldn’t like that (better/cheaper/faster)?</p>
<p class="look-at-me"><span><q>Remember: a toxic workplace is more likely to change you than you are likely to change a toxic workplace.</q> – <a href="https://twitter.com/Nikyatu/status/975402362360778753">Nikyatu</a></span></p>
<p>If our civic service workplaces are pathological or bureaucratic, then we’re
likely to burn-out the people trying to provide good services at a fair cost to
the population. Service delivery will fail, and vulnerable people can’t get
access to the services they need.</p>
<p>If our private sector companies are like this, failure tends to mean the
company goes out of business. That ought to be a thing that a company, the
employees, and the shareholders care about.</p>
<p>It can also be bad for the consumer. Fewer choices in the marketplace can lead
to monopoly and worse outcomes for the consumer.</p>
<p>So I made this little bit of the internet.</p>
<p>This is an index page for a collection of writings on this topic. It’s been <a href="https://twitter.com/jabley/status/823600749968031744">a long time coming</a>.</p>
<p>I hope it’s useful.</p>
<p class="look-at-me"><span><q>What didn’t you do to bury me but you forgot I was a seed.</q> – <a href="https://medium.com/@ashponders/on-buried-seeds-abd4d3ebba7a">Dinos Christianopoulos</a></span></p>
<h1 id="the-stories">The Stories</h1>
<ol>
<li><a href="/difficult-conversations/">Difficult conversations</a></li>
<li><a href="/working-where-youre-not-wanted/">Working where you’re not wanted</a></li>
</ol>James AbleyIf you’ve been here before, you might want to skip to the stories.Difficult conversations2018-04-14T20:29:31+00:002018-04-14T20:29:31+00:00https://jamesabley.com/difficult-conversations<p><em>This is part of <a href="/organisation-cultures/">the Organisation Cultures series</a></em>.</p>
<h1 id="a-case-study">A Case Study</h1>
<p><em>These numbers are for illustrative purposes only, with links to show that they
are comparable with what some companies have experienced</em>.</p>
<p>I am running a programme of work. <a href="http://www.thisismoney.co.uk/money/markets/article-2562201/Marks-Spencer-launches-long-awaited-150m-website.html">It takes £150 million and 3 years to complete</a>.</p>
<p>Is it successful?</p>
<p>What if I told you that the estimate was £156 million and 3 years?</p>
<p>Is it successful?</p>
<p>Does your answer alter if I add that, post-launch, <a href="http://www.cityam.com/1404807319/marks-spencer-website-nightmare-knocks-shares">the £1 billion
revenue through that channel dropped by 10%</a>?
And revenue through other channels wasn’t affected, so it wasn’t a change in
product. We have clear correlation and causation.</p>
<p>What about knowing that it took a further 2 years for revenue to get back to
previous levels?</p>
<p>Or that the time to change went up by 200%, or 12 weeks?</p>
<p>Or that the cost of change went up by 300%?</p>
<p>What if I told you that you could have spent £15 million getting a much better
thing in one year?</p>
<p>As a systems thinker, I prefer to have a wider view of a thing. And in this
example, the wider view is valid. It’s essential when you are defining how you
measure success.</p>
<p>In pathological organisations, I’ve seen what I can only describe as a
collective organisational delusion for things like the above example.</p>
<p>Adults were covering their ears and saying “la la la I can’t hear you”,
chanting “on-time and to-budget” as a mantra.</p>
<p>This wasn’t a good news culture, this was an “unable to face reality”
culture.</p>
<h1 id="advice">Advice</h1>
<p>You can try to do <a href="https://codeascraft.com/2012/05/22/blameless-postmortems/">a blameless postmortem</a> and apply the
<a href="http://retrospectivewiki.org/index.php?title=The_Prime_Directive">Retrospective Prime Directive</a>.</p>
<blockquote>
<p>Regardless of what we discover, we understand and truly believe that everyone
did the best job they could, given what they knew at the time, their skills
  and abilities, the resources available, and the situation at hand.</p>
</blockquote>
<p>No-one sets out to achieve bad outcomes. People definitely might make
sub-optimal decisions, because they aren’t equipped to know of an alternative.</p>
<p>But if people aren’t able to accept “OK, this is where we are, how do we
improve it?”, you’re not going to make any progress.</p>
<p>Don’t bother sticking around – leave – find somewhere more fertile, that’s a
kinder environment to work in.</p>James AbleyThis is part of the Organisation Cultures series.Working where you’re not wanted2018-04-14T20:29:31+00:002018-04-14T20:29:31+00:00https://jamesabley.com/working-where-youre-not-wanted<p><em>This is part of <a href="/organisation-cultures/">the Organisation Cultures series</a></em>.</p>
<p>Sparked by a too brief chat with <a href="https://twitter.com/ad_greenway">Andrew</a> at <a href="https://www.oneteamgov.uk/global/">OneTeamGov Global</a>.</p>
<h2>Generative organisations want you there</h2>
<p>They want people challenging existing systems, and improving things.</p>
<p>And that means it feels absolutely fantastic to work in those places.</p>James AbleyThis is part of the Organisation Cultures series.Open letter to a Google recruiter2017-08-06T07:08:31+00:002017-08-06T07:08:31+00:00https://jamesabley.com/open-letter-to-a-google-recruiter<p><em>This was a hard email to write, and I’m republishing it here since I think it’s an important issue.</em></p>
<p>Hi,</p>
<p>I have friends at Google. I think Google do really interesting work, and have lots of great people that I would love to work with. I think I would be able to fit in there, and help people succeed.</p>
<p>I’m reluctant to take this conversation further, for a few reasons:</p>
<ul>
<li><a href="https://www.theguardian.com/technology/2017/may/26/google-gender-discrimination-case-salary-records">https://www.theguardian.com/technology/2017/may/26/google-gender-discrimination-case-salary-records</a></li>
<li><a href="https://motherboard.vice.com/en_us/article/kzbm4a/employees-anti-diversity-manifesto-goes-internally-viral-at-google">https://motherboard.vice.com/en_us/article/kzbm4a/employees-anti-diversity-manifesto-goes-internally-viral-at-google</a></li>
</ul>
<p>These things do not give me a good impression of Google’s culture.</p>
<p>I’m lucky enough to be in the position where I can be more selective about where I spend my time.</p>
<p>I think I’d prefer to see how the company responds before we see whether it’s worth talking more.</p>
<p>All the best,</p>
<p>James</p>James AbleyThis was a hard email to write, and I’m republishing it here since I think it’s an important issue.On making active technology choices2017-03-07T19:03:42+00:002017-03-07T19:03:42+00:00https://jamesabley.com/on-making-active-technology-choices<p><em>This post was originally written as part of the styleguides repository for
Marks and Spencer Digital. That repository has since been removed from the
internet, which is a very separate, sad tale. The post was needed to prevent
technology choices being imposed on teams by The Architects, whilst at the same
time ensuring the technology estate didn’t become an uncontrolled sprawling
garden.</em></p>
<p><em>I’m republishing it here under Creative Commons BY-SA 4.0</em></p>
<h2 id="technology-choices">Technology choices</h2>
<p>These are both top-down, and bottom-up. We’ll follow something like the
<a href="https://www.gov.uk/service-manual/making-software/choosing-technology">Government Service Design Manual guide to Choosing Technology</a> (subject
to our continuous improvement tweaks!). The headline parts are that you
should:</p>
<ul>
<li>make explicit, active choices rather than sleepwalking into something</li>
<li>start off thinking about capabilities, rather than products/frameworks</li>
<li>be aware that you should be free to change your mind, and what that means</li>
</ul>
<p>Tech choices are bottom-up because:</p>
<ul>
<li>The team (engineers) should choose tools that they are productive with,
and enjoy using</li>
<li>Learning a new thing can be fun. Work <strong>should</strong> be fun</li>
<li>Heads of Engineering will not tell you to use language/framework X</li>
<li>Architects will not tell you to use language/framework X</li>
</ul>
<p>Tech choices are top-down because:</p>
<ul>
<li>We want to encourage rotations between teams, and people should not have
to learn entirely new things to do that</li>
<li>We want people in teams to be able to do on-call support for products
other than their own, so some commonality (in terms of operations manuals,
health checks, logging) is to be expected</li>
<li>We need to be able to hire people to develop and maintain things</li>
<li>Bus factor is a thing. If you’re the only OCaml dev in the building, sorry!</li>
</ul>
<p>So esoteric tech choices will need a lot of justification.</p>
<h2 id="recording-decisions">Recording Decisions</h2>
<p>There is no monopoly on having good ideas. Anyone can have them,
regardless of title. We need ways of assessing these ideas,
and measuring them. We do this with our products, making prototypes,
having hypotheses, and measuring data.</p>
<p>To help us make course corrections over time, we should:</p>
<ul>
<li>document these hypotheses</li>
<li>compare what we thought, whether we were right, and what we’ve learned
since then</li>
</ul>
<p>For technology and code, we’re going to use <a href="http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions">Architecture Decision Records
(ADRs)</a>. There are <a href="https://github.com/npryce/adr-tools">tools to help work with them</a>. If you use
Homebrew or Boxen, then it is very <a href="https://github.com/jabley/our-boxen/commit/cc7c7723820b29edbd7ef9eea5e14c3bc982d008">simple and easy to install these tools</a>.</p>
<p><a href="https://github.com/search?l=markdown&q=user%3Aalphagov+adr&type=Code&utf8=%E2%9C%93">Here are some examples of them elsewhere</a>.</p>
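<p>If you would rather not install anything, an ADR is only a short text
file. This is a minimal sketch that renders one in Markdown, assuming the
usual Status, Context, Decision and Consequences sections. The function and
the example decision are invented, and this is not the output of the tools
linked above.</p>

```python
from datetime import date


def render_adr(number, title, context, decision, consequences,
               status="Proposed", on=None):
    """Render an Architecture Decision Record as Markdown text."""
    on = on or date.today()
    lines = [
        f"# {number}. {title}",
        "",
        f"Date: {on.isoformat()}",
        "",
        "## Status", "", status, "",
        "## Context", "", context, "",
        "## Decision", "", decision, "",
        "## Consequences", "", consequences, "",
    ]
    return "\n".join(lines)


# Hypothetical example: the hypothesis is written down so it can be checked later.
adr = render_adr(
    number=1,
    title="Use structured JSON logging",
    context="On-call engineers support products other than their own.",
    decision="All services log JSON to stdout.",
    consequences="We expect faster diagnosis; revisit in six months.",
    on=date(2017, 3, 7),
)
print(adr)
```

<p>Keeping these files in the same repository as the code means the decision
and the thing it describes change together.</p>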
<p><a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br /><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">On making active technology choices</span> by <a xmlns:cc="http://creativecommons.org/ns#" href="https://jamesabley.com/on-making-active-technology-choices/" property="cc:attributionName" rel="cc:attributionURL">James Abley</a> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.<br />Based on a work at <a xmlns:dct="http://purl.org/dc/terms/" href="https://github.com/jabley/jamesabley.com/tree/gh-pages/_posts/2017-03-07-on-making-active-technology-choices.md" rel="dct:source">https://github.com/jabley/jamesabley.com/tree/gh-pages/_posts/2017-03-07-on-making-active-technology-choices.md</a>.</p>James AbleyThis post was originally written as part of the styleguides repository for Marks and Spencer Digital. That repository has since been removed from the internet, which is a very separate, sad tale. The post was needed to prevent technology choices being imposed on teams by The Architects, whilst at the same time ensuring the technology estate didn’t become an uncontrolled sprawling garden.Assessing coding tests2017-03-01T17:00:03+00:002017-03-01T17:00:03+00:00https://jamesabley.com/assessing-coding-tests<p>A bit of internet that I can point to when I have a conversation with people
about coding tests.</p>
<p><strong>Do not blindly copy this</strong></p>
<p>You should adapt it to your context. You should also ensure that the
instructions to the candidate tell them what’s important, and what you’ll be
looking for. Otherwise you’re moving the goalposts, and that’s unfair.</p>
<p>Personally, I’m a big fan of Test-Driven Design, but I won’t always use it. In
some contexts, I’m just throwing some shit together to learn more about the
problem. I’ll take all the shortcuts I can to get to that learning as fast as
possible.</p>
<p>But for a coding test, you’re probably after getting the candidate to
demonstrate the things you care about in production code.</p>
<h2 id="functionality">Functionality</h2>
<ul>
<li>Works with the sample provided?</li>
<li>Works with other valid test cases?</li>
<li>Deals sensibly with incorrect input?</li>
</ul>
<h2 id="style">Style</h2>
<ul>
<li>Readable?</li>
<li>Idiomatic use of language and language features?</li>
<li>Good use of OO and/or functional constructs?</li>
<li>Consistent naming?</li>
<li>Well & consistently formatted, good use of whitespace?</li>
<li>No inappropriate hard coding?</li>
</ul>
<h2 id="structure">Structure</h2>
<ul>
<li>Broken into pieces?</li>
<li>Separation of concerns?</li>
<li>Single responsibility (and SOLID in general)?</li>
<li>Loose coupling?</li>
<li>Domain driven?</li>
</ul>
<h2 id="tests">Tests</h2>
<ul>
<li>Has tests?</li>
<li>Test driven?</li>
<li>Unit tests vs. integration tests?
<ul>
<li>Appropriate use of mocking?</li>
</ul>
</li>
</ul>
<h2 id="extras">Extras</h2>
<ul>
<li>Has build script?
<ul>
<li>Extra points for Vagrant or Docker!</li>
</ul>
</li>
<li>Command line invokable?</li>
<li>Use of version control?
<ul>
<li>Small, frequent commits?</li>
<li>Commit messages tell a narrative?</li>
</ul>
</li>
</ul>James AbleyA bit of internet that I can point to when I have a conversation with people about coding tests.Devops Defined2017-02-07T15:57:00+00:002017-02-07T15:57:00+00:00https://jamesabley.com/devops-defined<p>It’s 2017 and we’re still having this discussion? Alright then, here are some
definitions for your consideration…</p>
<h2 id="top-definition">Top Definition</h2>
<p>DevOps (and latterly Devops) is not a role. It’s not a team.</p>
<p>If you advertise for Devops engineer, or Head of Devops, then either you’re
doing:</p>
<ul>
<li>keyword stuffing to reflect the language that your potential audience
might use. This can be a recruiting hack done by SEO-aware people,
begrudgingly</li>
<li>it very wrong, for the reasons to follow</li>
</ul>
<p class="look-at-me"><span><q>DevOps (a clipped compound of “software DEVelopment” and “information
technology OPerationS”) is a term used to refer to a set of practices that
emphasise the collaboration and communication of both software developers and
information technology (IT) professionals while automating the process of
software delivery and infrastructure changes. It aims at establishing a culture
and environment, where building, testing, and releasing software can happen
rapidly, frequently, and more reliably.</q> – <a href="https://en.wikipedia.org/wiki/DevOps">Wikipedia</a></span></p>
<p>Culture.</p>
<p>Set of practices.</p>
<p>Collaboration and communication.</p>
<p>Not tools.</p>
<p>Not a job description.</p>
<p>Not a role.</p>
<p>Not a team.</p>
<p>My friend (and former colleague<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>) Gareth Rushgrove wrote some considered
thoughts on this topic as part of the
<a href="https://www.gov.uk/service-manual">Government Service Design Manual</a>.
That has since evolved and no longer contains the section on Devops, but it
<a href="https://github.com/gds-attic/government-service-design-manual/blob/95c9ad99bfcb724297b6c82691b4610919be21da/service-manual/operations/devops.md">remains available in the archive</a>.</p>
<p class="look-at-me"><span><q>If your job advertisement says “devops engineer,” I guarantee you are
alienating the very people you want to hire.</q> – <a href="https://twitter.com/nickstenning/status/463431661984956416">Nick Stenning</a></span></p>
<p>An often-made comparison is to take agile. Agile is a way of working. You
wouldn’t say you were going to hire an agile, so how can you possibly hire a
Devops?</p>
<p>A rejoinder to that is to say that you can have agile engineers. I would say a
strong “No!” to that position. Engineers familiar with agile ways of working?
Sure.</p>
<p>If you are hiring someone to join your Devops team, I know you have at least 2
problems:</p>
<ol>
<li>You think Devops is a role/team</li>
<li>You think a single team should be responsible for:
<ul>
<li>culture</li>
<li>evolving practices</li>
<li>improving communication</li>
<li>improving feedback loops</li>
</ul>
<p>rather than those being concerns that should cut across the entire organisation<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
</li>
</ol>
<p>That screams siloed organisation to me. They’re fun, them<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>.</p>
<h2 id="2-devops">2 Devops</h2>
<p>To do a bad thing, by ssh-ing onto a server and doing <code class="language-plaintext highlighter-rouge">sudo vi some-file</code> to fix
the problem. Sometimes done during emergencies to restore normal operation to
some IT system.</p>
<p><strong>I Devopsed the shit out of it</strong></p>
<h2 id="3-devops">3 Devops</h2>
<p>To break all the things at the same time, thanks to the joys of having automated
large parts of your job. It’s now even easier to break everything, rather than
just a single instance of a thing. We can write code which updates all our
servers at the same time, rather than just updating them one at a time.</p>
<p><strong>Oh crap, I Devopsed it</strong></p>
<h2 id="who-cares">Who cares?</h2>
<p>It could be argued that language evolves and the meaning has changed. Maybe
Devops has been co-opted now, much like there used to be the distinction between
cracking and hacking. These days, no-one really uses the former any more, and
the latter has subsumed the meaning of the former.</p>
<p>Well, I care. I’ve seen too many organisations not appreciate the subtleties of
looking at flow, feedback loops, and continuous improvement. That’s why this
matters.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Thought you’d appreciate that, Mazz and Josh ;) <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p><a href="/whats-wrong-with-bimodal-it/">My position on Bi-modal IT hasn’t changed</a> <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>Post to follow on that topic <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
James Abley
It’s 2017 and we’re still having this discussion? Alright then, here are some definitions for your consideration…
Distributed Array of Commodity Hosting Services (DACHS)
2016-10-02T15:29:37+00:00
https://jamesabley.com/distributed-array-of-commodity-hosting-services
<p class="look-at-me"><span><q>not [suitable for] enterprise production use where an
enterprise such as ourselves would want an SLA.</q> – anonymous Architect</span></p>
<p>The above quote shocked me when I was describing to some people how a particular
solution worked. The ideas that it contains don’t seem to be as well-known
as I thought. This post is an attempt to do something about that.</p>
<p>This approach for running internet services is a logical step from the
<a href="http://research.google.com/pubs/pub35290.html">Google research paper “The Datacenter as a Computer”</a>. I’m not sure it has a
name, or is documented anywhere else. It’s an approach I’ve used to reduce hosting
costs by a factor of 1,000 for a recent project.</p>
<blockquote>
<p>Our central point is that the datacenters powering many of today’s successful Internet services
are no longer simply a miscellaneous collection of machines co-located in a facility and wired up
together. The software running on these systems, such as Gmail or Web search services, execute
at a scale far beyond a single machine or a single rack: they run on no smaller a unit than clusters
of hundreds to thousands of individual servers. Therefore, the machine, the computer, is this large
cluster or aggregation of servers itself and needs to be considered as a single computing unit.</p>
</blockquote>
<p>Rather than treating a datacenter as a computer, we can treat any component at
any level as a composable abstraction.</p>
<p>It came about after discussions with <a href="https://twitter.com/jpluscplusm">Jonathan</a> and <a href="https://twitter.com/adamwright">Adam</a>. We wanted
to package commodity services. By doing so, we thought we could <a href="#fn:1">make something
greater than the sum of the parts</a><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. It seemed to provide a better
level of service than any of them would offer in isolation. And so we conceived
of DACHS<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
<h2 id="starting-point">Starting point</h2>
<p>We have some Platform as a Service (PaaS) that we want to run our application
on. We need to run 100 instances of our application to handle the expected
load. We’ll call this zone A.</p>
<p>The zone A PaaS offers a Service Level Agreement (SLA) of 95% availability<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>.</p>
<table>
<tr>
<th>PaaS cost (%)</th>
<th>Availability (%)</th>
</tr>
<tr>
<td>100</td>
<td>95</td>
</tr>
</table>
<p>How can we improve on that?</p>
<h2 id="add-a-thing-in-front-of-it">Add a thing in front of it</h2>
<p>Remember the <a href="https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering">fundamental theorem of software engineering</a>:</p>
<blockquote>
<p>We can solve any problem by adding a layer of indirection</p>
</blockquote>
<p>First we put in another layer in front of our PaaS. This layer offers an SLA of
99.99% availability. This layer routes traffic to our stateless application
running in the PaaS.</p>
<table>
<tr>
<th>PaaS cost (%)</th>
<th>Availability (%)</th>
</tr>
<tr>
<td>100</td>
<td>94.99</td>
</tr>
</table>
<p>This has actually made things worse. The <a href="#fn:1">serial availability equation</a> tells us
how to calculate availability for this case.</p>
<p><img src="/images/dachs/serial.svg" alt="Availability of Serial Components" title="Availability of Serial Components" /></p>
<p>It’s the product of the availabilities of all the components in the network path. So
that would be <code class="language-plaintext highlighter-rouge">0.9999 * 0.95 = 0.949905</code>, or 94.9905% availability.</p>
<p>We’re also now paying for another thing – the routing layer.</p>
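<p>As a rough sketch, the serial calculation above can be written in a few lines of Python. The function name <code class="language-plaintext highlighter-rouge">serial_availability</code> is made up for illustration; the two figures are the SLAs quoted in this post.</p>

```python
from math import prod


def serial_availability(*components: float) -> float:
    """Availability of a chain where every component must be up.

    Each argument is an availability expressed as a fraction (0.95 == 95%).
    """
    return prod(components)


routing_layer = 0.9999  # SLA of the layer in front of the PaaS
zone_a = 0.95           # SLA of the zone A PaaS

combined = serial_availability(routing_layer, zone_a)
print(f"{combined:.4%}")
```

<p>Every extra component in the path can only pull the result down, which is why adding the routing layer on its own made things slightly worse.</p>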
<h2 id="cope-with-single-zone-failure">Cope with single zone failure</h2>
<p>We next add a similar PaaS, with similar SLAs. We deploy our app into that, and
update our routing layer to send traffic to this new zone B as well. With this
change, we:</p>
<ul>
<li>can cope with an outage happening in either zone; the other zone will just soak up
the traffic</li>
<li>have increased our availability</li>
</ul>
<p>We can now use the <a href="#fn:1">parallel availability equation</a>.</p>
<p><img src="/images/dachs/parallel.svg" alt="Availability of Parallel Components" title="Availability of Parallel Components" /></p>
<p>The combined availability of zones A and B is <code class="language-plaintext highlighter-rouge">1 - [(1 - 0.95) * (1 - 0.95)] = 0.9975</code>.</p>
<p>But we still have the routing layer at 99.99%. We need to combine those numbers
using the serial availability equation again. That gives us
<code class="language-plaintext highlighter-rouge">0.9999 * 0.9975 = 0.99740025</code>, or 99.740025%. By running 2 zones in parallel, we get a big
boost above the basic 95% SLA that either PaaS provider used individually
would offer.</p>
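<p>The two-zone arithmetic can be sketched in Python as follows. The function names are illustrative, and the parallel calculation assumes that zone failures are independent of each other, which is worth checking against your own choice of providers.</p>

```python
from math import prod


def serial_availability(*components: float) -> float:
    """All components must be up: multiply the availabilities."""
    return prod(components)


def parallel_availability(*components: float) -> float:
    """At least one component must be up.

    Assumes failures are independent, so the chance of everything being
    down at once is the product of the individual unavailabilities.
    """
    return 1 - prod(1 - a for a in components)


zones = parallel_availability(0.95, 0.95)     # zones A and B side by side
overall = serial_availability(0.9999, zones)  # routing layer in front of them

print(f"zones:   {zones:.4%}")
print(f"overall: {overall:.6%}")
```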
<p>It might be that both zones have an outage at the same time, as part of their
95% SLA. We can make choices around which provider to use to try to avoid that.</p>
<p>For example, Azure, AWS, and Google are unlikely to all have problems at the
same time. If they do, something really bad has happened.</p>
<p>Unfortunately, we’ve doubled our costs in a naive implementation. We’re now
paying for 100 instances in both zones.</p>
<table>
<tr>
<th>PaaS cost (%)</th>
<th>Availability (%)</th>
</tr>
<tr>
<td>200</td>
<td>99.74</td>
</tr>
</table>
<p>Elastic scaling could help with that cost aspect. There is a risk if other
people have architected things in this way: everyone could end up trying to
autoscale in the same zone! You should talk to a relevant stakeholder about how
much they care whether you solve that problem.</p>
<h2 id="cope-with-multiple-zone-failures">Cope with multiple zone failures</h2>
<p>We decide to add some more PaaS providers. We now have zones C, D, and E, and we
get to make other decisions.</p>
<p><img src="/images/dachs/multiple-zones.svg" alt="Availability of multiple parallel components" title="Availability of multiple parallel components" /></p>
<p>How many zone failures do we need to be resilient against? If the answer is just
1, then we only need to put 25 instances of our application in each zone,
and we’re good. We’d be paying for <code class="language-plaintext highlighter-rouge">5 * 25 = 125</code> instances of the application
running across the 5 zones. We’d still have 100 available to serve customers
if a single zone failed.</p>
<table>
<tr>
<th>PaaS cost (%)</th>
<th>Availability (%)</th>
</tr>
<tr>
<td>125</td>
<td>99.99</td>
</tr>
</table>
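<p>The availability figure in the table can be reproduced with a rough sketch. It assumes that “available” means at least one zone is still serving traffic, that all 5 zones offer the same 95% SLA, and that zone failures are independent. The function names are illustrative.</p>

```python
from math import prod


def serial_availability(*components: float) -> float:
    """All components must be up: multiply the availabilities."""
    return prod(components)


def parallel_availability(*components: float) -> float:
    """At least one component must be up (assumes independent failures)."""
    return 1 - prod(1 - a for a in components)


zone_sla = 0.95
routing_layer = 0.9999

# Zones A to E, each assumed to offer the same SLA.
five_zones = parallel_availability(*[zone_sla] * 5)
overall = serial_availability(routing_layer, five_zones)

print(f"zones:   {five_zones:.7%}")
print(f"overall: {overall:.2%}")
```

<p>Under those assumptions the zones together are so close to 100% that the routing layer becomes the limiting factor, since the overall figure can never exceed the routing layer’s own 99.99%.</p>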
<p>We get to trade off operational complexity and cost against the degree of
resilience we want. Handling 2 zones being down might be the right level
for your context. In that case, we’d have:</p>
<ul>
<li>34 instances running in each zone</li>
<li>102 instances still serving customers if 2 zones failed at the same time</li>
</ul>
<table>
<tr>
<th>PaaS cost (%)</th>
<th>Availability (%)</th>
</tr>
<tr>
<td>170</td>
<td>99.99</td>
</tr>
</table>
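<p>The instance counts in both tables follow from one piece of arithmetic: divide the required capacity by the number of zones you expect to survive, and round up. A minimal sketch, with a hypothetical helper name and the numbers used in this post:</p>

```python
from math import ceil


def instances_per_zone(required: int, zones: int, tolerated_failures: int) -> int:
    """Instances to run in each zone so that `required` instances remain
    after `tolerated_failures` zones have gone down."""
    surviving = zones - tolerated_failures
    if surviving <= 0:
        raise ValueError("at least one zone must survive")
    return ceil(required / surviving)


REQUIRED = 100  # instances needed to handle the expected load
ZONES = 5       # zones A to E

for failures in (1, 2):
    per_zone = instances_per_zone(REQUIRED, ZONES, failures)
    total = per_zone * ZONES                   # what we pay for
    remaining = per_zone * (ZONES - failures)  # what is left after the failures
    print(f"tolerate {failures}: {per_zone} per zone, {total} total, {remaining} remaining")
```

<p>Because the baseline is 100 instances, the total doubles as the cost percentage in the tables: 125 for one tolerated failure and 170 for two.</p>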
<h2 id="profit">Profit!</h2>
<p>So I hope this shows that by packaging commodity components together, you can
achieve the SLAs otherwise obtained by driving large trucks of money to large
companies.</p>
<p>I also made <a href="https://www.youtube.com/watch?v=Pfvj5dLeDEA">a short video which demonstrates this approach</a>.</p>
<h3 id="notes">Notes</h3>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>There’s a theoretical basis for this as well. <a href="https://books.google.co.uk/books?id=wUyDF1yWfhMC&lpg=PA28&ots=iFJ0ivnIF7&dq=availability%20in%20parallel&pg=PA23#v=onepage&q=availability%20in%20parallel&f=false">The parallel and serial availability equations</a> are well-known within the electronics and network engineering communities. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>The name is completely Jonathan’s fault. He’s a (cat and) dachshund person. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>Some don’t seem to offer any at all. If we read the
<a href="https://run.pivotal.io/policies/terms-of-service/">Pivotal Web Services terms</a>, we see:</p>
<blockquote>
<p>11.1 Limitation of Liability. TO THE MAXIMUM EXTENT MANDATED BY LAW, IN NO EVENT WILL WE OR OUR LICENSORS BE LIABLE FOR ANY LOST PROFITS OR BUSINESS OPPORTUNITIES, LOSS OF USE OF THE SERVICE OFFERINGS, LOSS OF REVENUE, LOSS OF GOODWILL, BUSINESS INTERRUPTION, LOSS OF DATA; OR ANY INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, UNDER ANY THEORY OF LIABILITY, AND WHETHER BASED IN CONTRACT, TORT, NEGLIGENCE, PRODUCT LIABILITY, OR OTHERWISE. IN ADDITION, OUR AND OUR LICENSORS’ LIABILITY UNDER THIS AGREEMENT WILL NOT, IN ANY EVENT, REGARDLESS OF WHETHER THE CLAIM IS BASED IN CONTRACT, TORT, STRICT LIABILITY, OR OTHERWISE, EXCEED THE AGGREGATE FEES YOU PAID TO US FOR THE SERVICE OFFERINGS IN THE TWELVE (12) MONTHS PRIOR TO THE EVENT GIVING RISE TO YOUR CLAIM REGARDLESS OF WHETHER WE OR OUR LICENSORS OR SERVICE PROVIDERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES AND REGARDLESS OF WHETHER ANY REMEDY FAILS OF ITS ESSENTIAL PURPOSE. BECAUSE SOME JURISDICTIONS DO NOT ALLOW ALL OR SOME OF THE FOREGOING EXCLUSIONS OR LIMITATIONS OF LIABILITY, THE PRECEDING LIMITATION MAY NOT APPLY TO YOU.</p>
</blockquote>
<p><a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
James Abley
Types of Engineer
2016-08-02T07:35:12+00:00
https://jamesabley.com/types-of-engineer
<p>Software engineering is a broad field, with many different specialisms. I tend to consider 4 distinct roles. None of these roles are more senior than another. These are explicitly specialisms, not a hierarchy. There is no right way to be an engineer. You can be one or more of these things, with more thrown in.</p>
<h2 id="backend-engineer">Backend engineer</h2>
<p>Backend engineers create APIs for others to consume. Typically strong domain modellers, they are up-to-date with latest thinking about REST, binary protocols, API chattiness, and data storage.</p>
<h2 id="frontend-engineer">Frontend engineer</h2>
<p>Frontend engineers craft user interfaces. Good at breaking down interfaces into semantic components, removing cruft and using design elements to create a smooth experience, with deliberate friction where a user should stop and spend more time. HTML and CSS are their main toys, with a bit of JavaScript thrown in where necessary. But progressive enhancement and accessibility are their mantra.</p>
<h2 id="operations-engineer">Operations engineer</h2>
<p>Operations engineers build tools for others. They focus on automation, infrastructure as code, PaaS usage, and data storage.</p>
<h2 id="test-engineer">Test engineer</h2>
<p>The best testers are sick people. Why would you want to use an API like that? But they will anyway, and help you create a better, more secure product for your users.</p>
<h2 id="beware-of-labels">Beware of labels</h2>
<p>There are other things outside of these. In some organisations, they may be a distinct role. In others, they’ll be a hat that someone wears on occasion.</p>
<p>The boundaries are also fluid. I used to be a backend engineer, became more frontend as I did mobile web stuff, then back to backend, and finally operations.</p>
James Abley