Value of Data

“Big data is the future”, or so we are told. With enough data we can create models that provide good outputs for unknown inputs, using techniques such as estimating likely relationships with probabilities, finding trends and interpolating answers with regression, or training general-purpose learning algorithms.

In particular, Machine Learning (ML) is in vogue, and although the underpinning concepts aren’t new (I had assignments combining computer vision and artificial neural networks at university back in 2001), the capabilities of machines and easy access to massive levels of computing power now allow much more practical application of these concepts.

Regardless of the technology or the hype, there are universal concepts that are paramount to the successful application of any technology. For instance, it is important to understand what the technology can and can’t do, and which of its properties are intrinsic and which are variable. Continuing with ML as an example, it is more effective to pre-process an image, extract key attributes, and feed those into a neural network than to give the network a million pixels per data entry.

One universal concept is that the technology needs to solve a real problem, or to use business terms, needs to ‘add value’. There is a cost to using a technology – for big data, collecting data can be expensive, notably the cost of mitigating the risk of mismanaging it, i.e. ensuring it is secure and compliant. To offset this cost we need to establish value, which means asking:

  • How does having this give us a competitive advantage?
  • How can I monetize this?

For some of the big and famous organizations the answers to these are fairly clear: Amazon wants shopping data to provide better choices than competitors, drawing more customers and therefore more sales; Google and Facebook want information that targets their adverts to more of the right people, resulting in more buying per advert, which in turn incentivizes customers to buy more adverts.

One strategy for answering these questions is to create data that is so much better than competitors’ data that customers will pay to access it. This is not a new concept – software products have been up-selling reporting since time immemorial – but recently there seems to be more inclination to answer modelling questions rather than just provide charts. This is where the business questions need to be applied. For instance, if it is possible to mine the data to answer questions like “what impact does doing X have on Y”, then ask yourself whether these answers are something customers will pay for and competitors don’t have. If so, then you’re onto an excellent strategy. If not, then is having that data valuable?

Automated Testing Priorities

There’s a common theme in automated software testing that the greatest effort should go into unit tests, less into integration tests, and least into UI tests. This is known as the testing pyramid. However, I’m not convinced this is the best use of automated-test developers for web applications, because the nature of web standards and a commonly used web application architecture invalidate some of the assumptions behind the pyramid.

Testing Goals

Before we continue we need to state the goals of automated testing. In my mind they are:

  1. Validate the correctness of the thing under test
  2. Allow refactoring of the thing under test

Note that correctness includes ‘non-functional’ things, like authorization security.

From a business perspective, we want to know that the product works as intended. Working as intended means that the application performs correctly when used through its interfaces. This would suggest that UI tests are the most important, which is the opposite of conventional wisdom.

The reason often given for placing less focus on UI tests is that they are notoriously fragile. However, I posit that this fragility has been due to the nature of the interfaces being tested, which have tended to make identifying and interacting with UI elements automatically very hard, e.g. having to use control ids with WinForms applications. I’m also suspicious that less focus on UI tests is a consequence of Agile methodologies that insist on jamming all testing into the same cycle as development, resulting in automation being written against a UI in an extreme state of flux.

Unit Test Fragility

One problem I have with unit testing is that developers are encouraged to test the internals of the unit. This happens when mock objects are checked to see if certain methods were called on the mock.

The purpose of functions and classes is to expose a contract and hide the details of how that contract is fulfilled. Testing how a unit does its work means examining inside the black box, which defeats the purpose of using testing to support refactoring, because now we can’t change the implementation of a unit without breaking its tests.
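A sketch of the difference, using an invented pricing example (the service, repository, and discount values here are all hypothetical, and the spy/stub are hand-rolled rather than from any particular mocking library):

```typescript
// Hypothetical unit under test: a pricing service that looks up a
// customer discount through an injected repository.
interface DiscountRepo {
  discountFor(customerId: string): number; // discount as a fraction, e.g. 0.25
}

class PricingService {
  constructor(private repo: DiscountRepo) {}
  total(customerId: string, subtotal: number): number {
    return subtotal * (1 - this.repo.discountFor(customerId));
  }
}

// Brittle style: asserting *how* the unit worked. Refactoring PricingService
// to cache or batch discount lookups would break this test even though the
// observable behaviour is unchanged.
let calls = 0;
const spyRepo: DiscountRepo = { discountFor: () => { calls++; return 0.25; } };
new PricingService(spyRepo).total("c1", 100);
// e.g. assert that calls === 1

// Contract style: asserting only the observable output of the unit,
// leaving the implementation free to change.
const stubRepo: DiscountRepo = { discountFor: () => 0.25 };
const total = new PricingService(stubRepo).total("c1", 100); // 75
```

The first test pins down the interaction; the second pins down only the contract, which is what refactoring needs to rely on.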

UI Test Fragility

In his 2012 article on the testing pyramid, Martin Fowler says:

An enhancement to the system can easily end up breaking lots of such tests, which then have to be re-recorded. You can reduce this problem by abandoning record-playback tools, but that makes the tests harder to write. Even with good practices on writing them, end-to-end tests are more prone to non-determinism problems, which can undermine trust in them. In short, tests that run end-to-end through the UI are: brittle, expensive to write, and time consuming to run.

I believe that some of these assumptions are less valid in modern web test automation.

Automated web testing tends to be hand-written because (in my experience) recording tools can create quite fragile paths, usually because they don’t know which information is least variant. Hand-writing UI tests is straightforward thanks to CSS selectors, which are easy to use, well supported, and, when kept simple (i.e. id and class selectors rather than paths), not hugely prone to change. These selectors are usually wrapped into page objects that further insulate the tests from changes.
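A framework-agnostic sketch of the page-object idea. `Finder` stands in for whatever element-query API your driver provides (Selenium, Playwright, etc.), and the page, selectors, and messages are all hypothetical:

```typescript
// Minimal element abstraction so the sketch is self-contained; a real
// driver would supply something richer.
type UiElement = { click(): void; text(): string };
type Finder = (cssSelector: string) => UiElement;

class LoginPage {
  // Simple id/class selectors live in one place, so a markup change means
  // one edit here rather than edits scattered across every test.
  private static readonly submitButton = "#login-submit";
  private static readonly errorBanner = ".login-error";

  constructor(private find: Finder) {}

  submitForm(): void { this.find(LoginPage.submitButton).click(); }
  errorMessage(): string { return this.find(LoginPage.errorBanner).text(); }
}

// Tests talk to the page object, never to raw selectors. A fake finder is
// enough to demonstrate the indirection.
let clickedSelector = "";
const fakeFind: Finder = (sel) => ({
  click: () => { clickedSelector = sel; },
  text: () => (sel === ".login-error" ? "Invalid password" : ""),
});
const page = new LoginPage(fakeFind);
page.submitForm();
const message = page.errorMessage();
```

If the markup moves from `#login-submit` to some other id, only `LoginPage` changes; every test written against it keeps compiling and passing.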

The HTML DOM also exposes an event model which allows tests to mimic the vast majority of UI actions, removing the complexity of older-style tools which involved a lot of mouse coordinates and button states.

And finally, in web development, UI testing has the added benefit of enabling testing across multiple browsers – something less applicable to downloaded applications.

However, I agree that UI tests remain time-consuming to run, and if there are lots of developers committing to the repository then running them in continuous integration on every commit may not be possible, reducing their benefit for quickly catching problems.

Middle-Ground – Integration Testing the API

It is increasingly common for web applications to be built as a web API and a web (JavaScript) client. This is my personal preference over server-side rendering as it nicely decouples the presentation from the logic and allows the application to more easily integrate with other applications. There is some development overhead in this approach, but given most web pages perform some dynamic interaction with the server thus requiring some level of client richness, this overhead is quite minimal.

Having an API provides an excellent place for automated testing. An API is a contract and will express most, if not all, of the business rules through its inputs and outputs. It also requires basic security, and allows validation and authorization to be checked. It can be easily extended to run more extensive security testing (e.g. by manipulating HTTP headers and sending malicious data) and performance tests.

Integration testing the API doesn’t mean a full environmental setup is required. It is still reasonable to use mocks for calls that are slow or resources that aren’t available. For instance my integration tests use .NET Core’s TestServer rather than running a web server, EF’s UseInMemoryDatabase rather than instantiating a database, and stub out AWS service calls. These are reasonable compromises because I’m confident those areas will perform to their contracts.
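The same shape of compromise can be sketched in plain TypeScript (all names here are invented): the handler under test is exercised directly, with the slow dependency replaced by an in-memory stub that honours the same contract – analogous to swapping a real database for EF’s UseInMemoryDatabase:

```typescript
// Domain and contract for the stubbed dependency.
type Order = { id: string; total: number };
interface OrderRepo { byId(id: string): Order | undefined }

// The "API" under test: a plain function from request to response,
// standing in for a routed HTTP handler.
function getOrderHandler(repo: OrderRepo, id: string): { status: number; body?: Order } {
  const order = repo.byId(id);
  return order ? { status: 200, body: order } : { status: 404 };
}

// In-memory stub honouring the OrderRepo contract; no database required.
const stubRepo: OrderRepo = {
  byId: (id) => (id === "o1" ? { id: "o1", total: 42 } : undefined),
};

// Exercise the contract through its inputs and outputs.
const found = getOrderHandler(stubRepo, "o1");
const missing = getOrderHandler(stubRepo, "o2");
```

The test asserts on status codes and bodies – the contract – not on how the handler arrived at them, so the internals remain free to evolve.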


This is my ‘testing pyramid’ from highest to lowest priority:

  1. API integration tests
  2. Integration or unit tests for things that can’t be reached from the API
  3. UI tests for success scenarios

In my current application I have 98% code coverage and 90% branch coverage (largely ruined by not throwing exceptions inside all the C# using statements) of my business layer using the first two items on my list, and it has enabled considerable evolution and refactoring of the code-base over the last six months.

Generalists and Specialists

While I’m on the theme of the value of people, there is a group of technology professionals who are often quite undervalued – the generalists. Until the last couple of years there had been a trend of increasing specialization in technology development roles, notably the back-end/front-end split, which is now being replaced again by that great generalist role, the full-stack developer. And here’s the thing – overall, the cost of generalists and specialists doesn’t vary heavily, with subject specialists – e.g. InfoSec, DBAs, Authentication Architects (I saw that one advertised in Wellington) – costing a bit more, and platform specialists – e.g. Java dev, UI dev – a little less. On this continuum from generalist to specialist, generalists represent an absolute bargain.

The Generalist Developer

An experienced developer can do 90% of what a specialist can do in their field.

Need DevOps? Why not get your developer to do it? They can research potential solutions, read and understand API documentation, pick up Bash or PowerShell pretty quickly, and set up basic configurations based on vendors’ recommended best practice. Plus, when they’re done, they can go back to development rather than twiddling their thumbs.

Need QA automation? Need requirements analysis? Need basic network setup? Need project management? Need customer support? Need internal IT? Need architecture? Need a DBA? These are all things I’ve done to a production level in my 14 years primarily as a ‘developer’.

The vast majority of software out there is about understanding, automating, and transforming processes, and generalists are amply qualified to solve these problems. And where they can’t solve a problem from experience, they are expected to go out and research the plethora of alternatives, running a huge gamut of potential technologies (and therefore specializations), and pick a solution.
Sure, they may not create the algorithm that founds the next Google, but those companies represent a minuscule segment of the field and require a level of specialization more likely found in academia than in industry anyway.

In software product development you want generalists. These are people who know that, for instance, information security is important so they pick technologies, tools, and solutions that promote good security practice. And because they’re not as sure of themselves they are more likely to test and verify their assumptions and designs. They also have a wide view of the world, so can much more effectively evaluate trade-offs between different solutions and solve a wider range of problems than a specialist can. And typically, for little difference in price!

The Specialist

I’m not suggesting we don’t need specialists at all. I’ve found their value to be acting in advisory or consultancy roles where they are the checks-and-balances that warn people about traps and pitfalls and educate the generalists on best practices. I freely acknowledge I wouldn’t have some of the knowledge I have today without the support specialists have been able to provide.

However, this very act of education decreases the relative value of the specialist because, by receiving more knowledge, the generalists ‘level up’ and reduce the knowledge gap in the specialist’s field. That makes the need for the specialist more tenuous, and some people find it challenging to overcome the instinct to protect their space. This assumes specialists are static creatures – I would expect they too are continually learning and trying to level up – but within one organization the usefulness of their new knowledge may be limited.

Another problem with specialists in a small organization is that they effectively constrain your solutions. The company thinks, “well, I’ve got an Oracle DBA so we’d better use Oracle and stored procedures”, even if that’s not the best solution. A generalist, by contrast, will evaluate a range of solutions based on their experience, the company’s accumulated knowledge and environment, industry trends, future hiring needs, relative costs, etc. to inform the choice.


If you’re a five person development shop, a specialist doesn’t make sense. If you need that expertise you should hire a consultant. If you’re a five hundred or five thousand person development enterprise, then those specialists should have plenty to do to make it worth having them on the payroll.

Perverse Incentives


It is well established that CEO pay has climbed astronomically in the last decades, and that over a similar period inequality has grown throughout the Western world. This period has also been dominated by laissez faire economics, and corporatism to the extent of corporate welfare.

There are some indications that we are swinging away from this setting, with even the IMF saying, “Empirical evidence suggests that it may be possible to increase [personal income tax] progressivity without adversely affecting economic growth” (IMF Fiscal Monitor, October 2017), but it will be some time, if ever, before the mind-set of actors in this economic dance changes.

Right now, you might be wondering whether this is a software blog or some economic philosophy one, but be assured, this relates back to management, and particularly how it impacts high-skill industries such as software development.


Our current economic setting has created a system of incentives that is at odds with the goals of good management.

Good management in knowledge industries emphasises an ‘upside-down-pyramid’ of management which supports, rather than directs, the activities of skilled front-line executors. Put another way, it genuinely recognizes that people are the most important asset in a business and the role of management is to create an environment where those people with the skills can excel.

It is also clear that managers can add value in ways that others can’t. They can use their outright authority to connect otherwise separate groups, resolve conflicts, and use their bird’s-eye-view of the business to sponsor change and ensure strategic alignment, a.k.a. ensuring the right work is being done (HBR, 1995).

In our society we equate value with money, and given the greater value managers can add, we pay managers more. We also expect more return from more money, so we expect increased responsibility and accountability from said managers. But here we reach the crux of the problem: to support skilled staff it is important to empower them with responsibility, so how can the manager be held accountable whilst giving away the responsibility? To readers who (like myself) have been managers this is the “no kidding” moment, so I would ask: how have you tried to change that? This is a systemic problem, and as far as I can tell our approach has been grudging acceptance of the status quo.

How strong is that responsibility?

Good managers empower people; people make mistakes, and it is unfair to hold the manager responsible for those mistakes, otherwise everyone would be too afraid to make mistakes and we’d destroy effectiveness and any hope of innovation. We also read in the media stories of obfuscating CEOs who (quite reasonably) admit they couldn’t have known that X was happening in their organization and so (will make some knee-jerk change to resolve it, and) can’t really be held responsible.

By highlighting that they’re ‘not really that responsible’, the premise that with increased value comes increased responsibility has been completely undermined. Now, this isn’t the only reason managers are paid more: the other commonly held notion is that managers add more value because their decisions have greater (financial) consequences. This I dispute, largely because those decisions are never made in a vacuum – there are a lot of advisors and analysts who effectively determine the decisions in the first place, with the executive being closer to a rubber stamp than a judge. But that would be a digression from the point of this post, which is to focus on the consequences of ‘having responsibility’.


Responsibility encourages micromanagement. When your head is on the line then your instinct is to get as involved as possible to ensure the outcome you want. There are plenty of good managers out there who manage to overcome this instinct, but the system is very much setup to encourage micromanagement, and that destroys empowerment.

Under micromanagement, instead of having a team who are comfortable pursuing their own direction to achieve the goals, you’ve now got a sort-of team member (the manager) with a much more limited view of the details (after all, the higher up an organization you are, the wider but shallower your view is expected to be) who, ironically, requires more energy from the team members to ‘manage’. This also makes the team feel less ownership of their work because accountability becomes blurred between the manager and the team. And instead of being measured by outcomes, the team are judged on details; details the manager is often not as qualified (as they think they are) to judge.

Micromanagers can also become the de-facto communication channel instead of the more effective approach of organizing teams to communicate with each other. This creates a knowledge and communication bottleneck, which is inefficient.

What does an effective manager do? They set a vision and goals, empower people to execute on them, provide cover so they can achieve their goals, resolve conflicts, and then put their feet up or have strategy off-sites or something like that. They should not be required to answer all the questions other people in the organization have – the team is the most capable to do that – but they can filter the questions to ensure the team can focus on the work.

But if your organization insists on you knowing everything because you are ‘responsible’ because you are expensive, then how can you be an effective manager?


So how do we fix this?

Essentially, be realistic about managers’ roles. If their roles are seen to be closer to the textbook roles of planning, leadership, organization, and control, rather than being the know-it-all of everything in their fiefdom, then responsibility is going to fall closer to where it is held: with the executing team.

With this reduction in perceived responsibility and expectation there should be a reduction in compensation.

This will also improve empowerment among the teams, giving them a greater sense of ownership and satisfaction, meaning they’re less likely to turn over.


This is something of a digression from previous ‘how-to’ posts. Instead I’ve felt motivated to share my perspective on leadership, which is an issue in society that impacts organizations of all sizes and kinds, from parenting through to corporations and government.

What is Leadership

I’ve often been struck by the distance between leadership as it is defined in management texts and how it is executed. My trusty management textbook places leadership as the fourth pillar of management in the section “Leading – To Inspire Effort”, and defines leadership as “the process of inspiring others to work hard to accomplish important tasks” (Schermerhorn 2001, p262). This is a fairly open definition that could include anything from managing an empowered self-managing team through to slavery. To be fair to the author, it is followed by six chapters expanding on the subject.

The contents of that text are based on the outcome of decades of research and analysis. In general, research seeks to simplify the thing under study as much as possible – to be the crucible that burns away insignificance and leaves us with the key factors that impact something. I believe concise leadership contingency models like Hersey-Blanchard, with its three-dimensional matrix of relationship behavior, task behavior, and follower readiness, illustrate how complicated systems can be abstracted to their significant details, and that such models are critical for illuminating various facets of management and preparing managers to handle the many different people and situations they may encounter (and to be clear: leaders are managers – if you are inspiring people then you are managing them).

What scientific management seems to cover less (or at least less so in introductory textbooks) is that people are… well… people. They aren’t ‘rational agents’ conforming to the box neatly defined by research, and they have – insert-deity-here forbid – feelings! People are squishy and unpredictable, and frankly, if you’re in a management position and don’t think I’m stating the bleeding obvious, then you need to find another job. The literature on this side of management tends to be more anecdotal, but it is also easy to empathize with regardless of which side of the managing/managed fence you fall on.

Theory vs Practice

So now I will add my anecdotes through some hypothesizing. I’m told (by Wikipedia) that around 1% of the population exhibits the most extreme form of narcissism, and it is my contention that these people tend to cluster in leadership roles. The very nature of “knowing you’re right” and projecting that confidence (however un-examined it may be) creates the vision that management theory looks for. It also creates the environment followers need in order to feel a sense of fulfillment – after all, in our comfortable post-Maslow world we need to make a difference to find satisfaction, and what better way than fulfilling a vision to ‘achieve great change/improvement/innovation/etc.’. The people who espouse this confidence are also lauded by their superiors, who naturally prefer supposed simplicity over the complex reality of the situation, and thus these people tend to be elevated into positions of power. Unfortunately, people who ‘know’ they’re right also tend to be extremely resistant to anything that challenges their perspective. Such a conflict can be very personal and highly destructive, given that any challenge is perceived as a threat to that person’s self-image or core values.

In practice the leader’s vision tends to be skewed towards their own goals, and while organizational alignment is usually covered by at least lip service, the goals tend to be angled towards their individual needs, whether for career progression (who hires the manager who thought the status quo was working great and opted not to change anything?) or a psychological need (e.g. admiration, entitlement).

This is the point where I start to struggle with these people. I believe I’m experienced enough to be positive about people and work with them to foster the goals of the relevant organization, but my natural desire for analysis means that over time I tend to find concerning dissonance in their positions. Where I’m not experienced enough, or perhaps just disinclined to submit to this aspect of culture, is that I will point out that dissonance, and in doing so create the conflict.

The world of management theory tells us to be transparent with problems because organizations can’t fix problems they can’t see, and it tells us managers that a moderate level of conflict is good (too little means people have stopped caring and are probably looking for other jobs). What it doesn’t tell us is that some of the time the manager is going to see that as a personal threat, or they’re going to ignore it and place you in the ‘whiner’ box, because these managers aren’t entirely cut out for their jobs, but there is seldom any way to observe this problem and correct it in an organization. Studies strongly indicate that the most significant factor in employee retention is their immediate manager, and yet dysfunction in that relationship is often invisible to the organization until it is too late.

We know what’s good for us, but…

Perhaps the most scientific expression of this I’ve run into is in the book Good to Great by Jim Collins. Chapter 3 very clearly summarizes that the best leaders have the opposite traits to narcissists. They are modest and under-stated, are diligent and workman-like, and they give credit where it is due but shield others from failure (Collins 2001, p39). This doesn’t stop them having a firm vision and a strong will to achieve it, but they do so by getting the right people and getting them to buy into the vision and steer the organization toward it, and expecting they’ll do the same at the next level of the organization. It is a positive and virtuous cycle if achieved.

The literature also highlights how salary and performance of top leaders correlate negatively. And yet this need for ‘leadership’, for the self-fulfillment and simplicity reasons I highlighted earlier, means these leaders, who are by all accounts bad at their jobs, continue to be highly rewarded – and probably more so than their less confident peers, given their heightened sense of self-worth likely translates into salary expectations.

I doubt any of this is new. Much has been written about how “ignorance more frequently begets confidence than does knowledge” (Darwin). What remains surprising, or perhaps depressing, is that for all we’ve learned about scientific management and about people and behaviors, we still reward sub-optimal behavior. Put another way, society seems to revere leaders who overestimate and under-deliver, and who are comfortable treating us as disposable minions to be crushed on the path to their own glory. And that doesn’t seem like progress.

What happened to enterprise frameworks?

I’ve been trying to decide how I feel about the rise of open-source libraries and frameworks at the expense of enterprise frameworks.


Before I dig in, I want to define some terms, at least as I understand them. To my mind, .NET Core is commercial because it is backed by a corporation, and a framework because it does many things. Perhaps it is not fully commercial, as it’s marginally distant from Microsoft’s core business of selling Windows (compared to, say, Active Directory or the Win32 API) and I don’t know who pays the developers, but it still comes across as an offering by a corporation. By comparison, most NodeJS libraries I’ve used are maintained by one or a few developers and at best might have been side projects at a company, but typically seemed unrelated to any enterprise, so I’d class them non-commercial; and as their function tended to be quite specific (e.g. an XML parser, an OpenID Connect client) I’d call them libraries. As with most things, these are not binary choices: commercial vs. non-commercial and framework vs. library are continua, with projects falling everywhere between the respective endpoints.

My transition to open source libraries

The bulk of my experience has been in the Microsoft ecosystem, but only in the last year have I started working with open source offerings, notably ASP.NET MVC 5 and now I’m very ensconced in .NET Core. In that year I was also involved in my first NodeJS back-end development.

Before this lack of open-source exposure casts me as some kind of Luddite, molly-coddled in the Microsoft way: 1. I’m referring to my business experience, where ‘time is money’, not the tools I’ve played with in personal projects; and 2. I’ve certainly used open source libraries and frameworks as a web developer – the main ones were PrototypeJs, YUI 2, jQuery, and ExtJS (before and after it became commercial). There were also plenty of small libraries used to fulfill specific web and back-end objectives – at one point I had to list them during a due diligence exercise and I’m pretty sure we got into the 30s. However, the bulk of my development time has been spent coding against frameworks and libraries that were either commercial (closed or open source, and usually free of cost) or very mature.

Thus in the last year I have gone from coding against predominately mature open source or closed source commercial frameworks to coding against a wide mix of small and large open source frameworks and libraries, and I’ve often found this transition to be detrimental to my goal of building an application to meet a business need. And thus we can conclude the introduction having reached the purpose behind this post: to elaborate on my thoughts about the consequences of open source on building software products.

My negative experiences with open source libraries

The area where NodeJS was starkly different to my previous experience was that many of the functions needed to make a basic application required an external library. The most memorable of these was finding an XML parsing and manipulation library. I don’t recall how many libraries I tried, but ultimately none of them represented a holistic solution for XML in the way System.Xml or DOMParser does. Looking back now, I don’t recall which ones were tried and why they didn’t work (possibly TypeScript-related at times) or even which one we eventually settled on; I just remember it being an annoying process that took us away from actually building a product. And I know NodeJS is all about JSON as a fundamental data structure, but XML is still everywhere and has to be consumed and produced all the time, so for a major environment to be without a sanctioned and reliable XML manipulator was, well, a culture shock.

Partial Implementations

The XML library experience illustrates one common characteristic of open source libraries: they tend to implement only a subset of a specification or expectation.

The challenge then is to know which parts of the specification are implemented, and how correctly. In some cases the library wiki or documentation provides a guide on what it does or doesn’t cover, and in some cases a suite of tests hints at correctness. Ultimately, the only reliable way to learn whether the library will do the job is to code against it and test the result.
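One cheap way to do that coding-against-it deliberately is a table of spec-driven probes covering just the features you need, run before committing to a candidate library, so gaps surface in minutes rather than after a day of debugging. This is only a sketch: `parseXml` here is a deliberately naive stand-in for whatever real library API is being evaluated.

```typescript
// A probe is one feature you depend on, expressed as a pass/fail check.
type Probe = { name: string; run: () => boolean };

// Returns the names of features the candidate doesn't support
// (a thrown error counts as unsupported).
function evaluate(probes: Probe[]): string[] {
  return probes
    .filter((p) => { try { return !p.run(); } catch { return true; } })
    .map((p) => p.name);
}

// Hypothetical candidate: a toy "parser" that only handles simple,
// un-namespaced element names.
const parseXml = (s: string): { root: string } => {
  if (!s.startsWith("<")) throw new Error("not xml");
  return { root: s.replace(/^<([a-z]+)[\s\S]*$/, "$1") };
};

const gaps = evaluate([
  { name: "parses simple element", run: () => parseXml("<a/>").root === "a" },
  { name: "parses namespaced element", run: () => parseXml("<ns:a/>").root === "ns:a" },
]);
// `gaps` now names the feature this candidate fails to cover.
```

Running the same probe table against each candidate turns a vague “does it do enough?” question into a concrete feature checklist per library.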

I found this out the hard way recently. After following the documentation and struggling to understand why a key function didn’t work, I got the source code of the library, and managed to step through it to discover the functionality simply hadn’t been implemented. I also eventually found a vaguely related GitHub issue confirming that. That was nearly a day wasted which could have been saved by a short list on the GitHub landing page saying ‘this library supports these 4 (of only 5) major functions’.

To be fair, this is not unique to open source. I recall with anguish the peril of straying off the beaten path with several mature or commercial libraries, where things it felt like the library should be able to do became prohibitively complex.

Poor Documentation

My biggest gripe with open source libraries is their documentation tends to be somewhere between poor and acceptable only for the primary use case. This is completely rational – if the contributors are primarily developers then their most effective contribution is to develop. As a result, there seems to be an acceptance that developers using the library will mostly have to help themselves via extensive searching or finding a public forum, like Stack Overflow, to get questions answered. This can be very time-consuming (especially when time-zones don’t match up) and again detracts from building business value.

Whereas a paid library typically comes with support, and as it is in the best interests of the company to minimize expensive human support time, they provide excellent documentation, forums, and other ways for people to help themselves easily.

I have to say that I’ve worked in the technical writing industry, and there is a substantial difference between what developers and good technical writers produce as documentation. Technical writers have an understanding of how people learn and come at concepts from multiple angles, and can be systematic about identifying what is and isn’t covered.

The framework that illustrates this point most effectively at present is .NET Core. On the surface it looks like there is significant documentation, but compared to what MSDN provides for the .NET Framework, it is missing a great deal:

  • lack of namespacing (I curse at the amount of time I spend tracking down namespaces);
  • API references lacking examples of both class and property use;
  • inheriting classes missing inherited properties;
  • poor cross-referencing between concept topics and API topics;
  • shallow concept topics.

It’s entirely possible Microsoft has been paying technical writers to do this and I am therefore perhaps criticizing their investment levels rather than the style of content, in which case it is a problem of commercial priorities rather than open source in general.

Boring Stuff

Speaking as a developer, creating new functionality is fun. Fixing bugs is not fun, and neither is writing extensive automated tests or localization. And if you’re a great developer but struggle with written communication, then taking time to document or to textually support library users seems like a really poor use of your time. So given a choice between expanding the library, and perhaps gaining the pride of greater adoption, or making minor fixes, what is the rational choice?

This is the natural consequence of removing both the support of a wider organization with customer support, documentation, and QA specialists; and removing the commercial incentives to meet paying customers’ needs. It is much easier to ignore a tricky issue if no-one is paying for it.

Let me be clear that I’m not denigrating developers here – most developers I’ve met have a strong sense of pride in their work and will do their best to provide quality and capability, but ultimately are limited in the time and desire they have available.

And again, this problem isn’t unique to open source. Companies make the same trade-offs all the time, often to their paying customers’ ire, and can get away with it because it costs the customer too much to switch away from them.

But Open Source == Freedom, Right?

Having cast aspersions on open-source libraries for several paragraphs, it is time to throw out some of the positives.

Top of my list of the benefits of open source is that the initial barriers to entry have basically evaporated. Do you have a computer and can code? Then you can solve someone’s problems without it costing you more than your labor plus the electricity to run the computer.

I’m careful to say initial here, because the concerns above are certainly barriers in themselves, but they tend not to strike early on in development because we usually start out following the paradigm of a given library, and only when we stretch away from its core capabilities do we encounter some of the aforementioned problems.


Unless the library in question is dead (i.e. no longer being maintained), I’ve found that issues generally get fixed faster. This may be because smaller teams are more nimble, or because open source developers are often top-shelf developers adhering to many of the practices that enable fast turnaround, like good test coverage and continuous integration. Companies tend to be less responsive because they have greater inertia, which comes from the time cost of organizing across departments as well as teams. Some of that inertia is in providing things like documentation or localization, so being responsive does come at a price.


With open source libraries you are not dependent on a vendor’s whims to get issues resolved. Instead there is the option to download the source and step through it to figure out what is going wrong, potentially fix the problem, and submit the fix back to the library so future versions will include it (having your own branch of a library is not desirable).

With the source code it is also possible to deeply understand what the library is doing and better understand how it was designed to be worked with. Source code is the ultimate documentation, and it is even better if it comes with a decent test suite.

But all this comes with a price – time. Trying to read and understand unfamiliar source code is a complicated and time-consuming activity, and it may take an unfamiliar developer orders of magnitude longer to fix a problem than a developer who knows the code.


I didn’t come to this with an agenda. The negatives run longer than the positives simply because it is easier to find things to complain about than to itemize what works. I’ve had some of these thoughts for a while and wanted to put them all down and think them through.

I think, in summary, that as an industry we’ve decided to trade time for money. Instead of paying for tools that are warranted to do the job by their vendors, we go free, but spend more time figuring out how to use the tools because the documentation is limited and the implementation is less complete than what that documentation might lead people to expect.

The first resulting question is, is this a good use of our time (i.e. company money)? Developers are expensive. Having developers diverted from their business mission because of tool challenges could be considered wasteful, or it could be considered the cost of doing business.

The next question is, is this what we (developers) want to be doing? Sometimes the answer is yes – we want to be continually learning or on the cusp of new technology; but sometimes it is no – we simply have a job to get done. What would be more useful are better ways of telling which libraries are good and what they are good at. Obviously better documentation would help, but aggregators that work on networks and statistics are also very useful. For instance, the download counts in NuGet or npm, the scoring in npmsearch, or the Google rank tell us about the adoption of a library, which is assumed to correlate with its value. The downside of putting too much emphasis on scoring tools is that it solidifies the status quo and therefore limits innovation. Is accidentally being early-adopters and getting angry with new libraries an acceptable price for the industry to pay to allow innovation to prosper?
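Those adoption signals are easy to gather programmatically. Here is a minimal Python sketch against npm’s public downloads endpoint (the endpoint path follows npm’s documented registry API; the package names and counts in the inline example are illustrative rather than live results):

```python
import json
from urllib.request import urlopen

# npm's public per-package download-count endpoint (documented registry API)
NPM_DOWNLOADS_API = "https://api.npmjs.org/downloads/point/last-week/{pkg}"

def weekly_downloads(pkg: str) -> int:
    """Fetch last week's download count for an npm package.

    The response is a small JSON object shaped like
    {"downloads": 12345, "start": "...", "end": "...", "package": "..."}.
    """
    with urlopen(NPM_DOWNLOADS_API.format(pkg=pkg)) as resp:
        return json.load(resp)["downloads"]

def rank_by_downloads(counts: dict) -> list:
    """Order candidate libraries by adoption, most-downloaded first."""
    return sorted(counts, key=counts.get, reverse=True)

# Illustrative comparison (counts hard-coded here, not fetched):
counts = {"express": 8_000_000, "koa": 900_000, "hapi": 300_000}
print(rank_by_downloads(counts))  # ['express', 'koa', 'hapi']
```

Download counts are only a proxy for quality, of course, which is exactly the status-quo bias described above.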

And finally, have I identified the wrong problem? Much of what I’ve noted is also a feature of many less mature or less widely used commercial libraries. Is what I’ve observed actually the consequence of my recent career transition which could also be described as from mature and conservative frameworks to newer and less tested ones? For instance, what would comparing the documentation between .NET Framework 1.1 and .NET Core 1.1 be like?


I’ve chosen to end with questions because I don’t have the answers. There are many trade-offs and different people in different circumstances will have different valid answers. There are undoubtedly frustrations with open source libraries, just as there are in commercial ones. There are also improvements that can be made to the open source ecosystem, like encouraging the involvement (through pride) of technical writers and QA experts to improve the quality of what we consume.


Full credit to the team who runs the Geonet website. For those who don’t know, it’s the website of the New Zealand crown research institute that handles all things seismic. Like many New Zealanders I’ve been to the website a lot since the recent Kaikoura Earthquake and I’ve never once experienced performance problems.

What I find particularly amazing is the scale they must be capable of handling. I imagine the normal website load is fairly light, perhaps in the order of a few hits per second. But following a major event like the earthquake, half of New Zealand, and I imagine a fair few others, are all over the website. I’ve got friends and family who talk about refreshing it every few minutes to see if that “really was an earthquake”. I’m guessing they’ve been handling tens or hundreds of thousands of hits per second in the last few weeks, maybe 4 orders of magnitude greater than usual.
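The “4 orders of magnitude” figure checks out with a quick back-of-envelope calculation (the specific traffic numbers are my own assumptions, picked from the rough estimates above):

```python
import math

# Assumed figures: "a few hits per second" normally, versus
# "tens or hundreds of thousands" after a major earthquake.
normal_hits_per_sec = 10
peak_hits_per_sec = 100_000

# log10 of the ratio gives the number of orders of magnitude
orders_of_magnitude = math.log10(peak_hits_per_sec / normal_hits_per_sec)
print(orders_of_magnitude)  # 4.0
```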

I was interested to know what powers this (I’d hate to think my tax dollars were being wasted on resources that were mostly underutilized 🙂 ). Their blog indicates that they’re using Fastly which provides, in Fastly’s words, a “real-time CDN”. I imagine this event was a pretty good test of Fastly’s capability and Geonet’s integration, and from what I’ve seen, they performed very nicely.
Well done Geonet and Fastly.


Welcome to my software development blog. I started this as a place to document my investigations into development languages and technologies that are unfamiliar to me.

After twelve and a half years full-time (and then some) in the industry, I’ve decided to take a break from the workforce and instead take the opportunity to delve more seriously into technologies that have captured my attention. So my intention in this blog is to create a series of small projects, linked together by tag, as well as more general posts tailored to a particular itch.

I haven’t enabled comments on the blog as I don’t want to be a moderator, and there are many other places on the internet (I personally tend to lurk on reddit) which are better suited to commenting and come with moderation.