Value of Data

“Big data is the future”, or so we are told. With enough data we can create models that provide us with good outputs for unknown inputs, using an array of techniques: probabilities to estimate likely relationships, regression to find trends and interpolate answers, or training general-purpose learning algorithms.

In particular, Machine Learning (ML) is in vogue, and although the underpinning concepts aren’t new (I had assignments combining computer vision and artificial neural networks at university back in 2001), the capabilities of machines and easy access to massive levels of computing power now allow much more practical application of these concepts.

Regardless of the technology or the hype, there are universal concepts that are paramount to the successful application of a technology. For instance, it is important to understand what the technology can and can’t do, and which of its properties are intrinsic and which are variable. Continuing with ML as an example, it is more effective to pre-process an image, extract key attributes, and feed those into a neural network than to give it a million pixels per data entry.

One universal concept is that the technology needs to solve a real problem or, to use business terms, needs to ‘add value’. There is a cost to using a technology – for big data, collecting data can be expensive, notably in mitigating the risk of failing to manage it properly, i.e. ensuring it is secure and compliant. To offset this cost we need to establish value, which means asking:

  • How does having this give us a competitive advantage?
  • How can I monetize this?

For some of the big and famous organizations the answers to these are fairly clear: Amazon wants shopping data to provide better choices than competitors, drawing more customers and therefore more sales; Google and Facebook want information that targets their adverts at more of the right people, resulting in more purchases per advert, which in turn incentivizes advertisers to buy more adverts.

One strategy for answering these questions is to create data which is so much better than competitors’ data that customers will pay to access it. This is not a new concept, as software products have been up-selling reporting since time immemorial, but recently there seems to be more inclination to answer modelling questions rather than just provide charts. This is where the business questions need to be applied. For instance, if it is possible to mine data to answer questions like “what impact does doing X have on Y?”, then ask yourself whether those answers are something that customers will pay for and competitors don’t have. If so, then you’re onto an excellent strategy. If not, then is having that data valuable?

React-Redux

I’ve been trying to get a clear picture of the data flows in React-Redux, and I thought I’d share that mental picture. I’m still quite new to React-Redux, so read this with a note of caution – but know that the end result was working software.

Redux

In Redux itself the data flow seems relatively straight-forward:
1. When something happens we dispatch an action to the store i.e. store.dispatch({ type: "TYPE_KEY", ... }).
2. Redux then calls the reducer which returns a new state back to Redux.
3. Functions that were previously registered using store.subscribe(fn) are called and those functions call store.getState() to get the new state.
This seems like a fairly classic observer pattern around an event-store.
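
A minimal sketch of that plain-Redux cycle, using an assumed counter reducer purely for illustration:

import { createStore } from "redux";

// a trivial reducer, assumed for illustration only
const counterReducer = (state = { count: 0 }, action) => {
    switch (action.type) {
        case "INCREMENT":
            return { count: state.count + 1 };
        default:
            return state;
    }
};

const store = createStore(counterReducer);

// 3. previously registered listeners run after every dispatch
store.subscribe(() => console.log(store.getState()));

// 1. dispatch an action; 2. Redux calls the reducer to produce the new state
store.dispatch({ type: "INCREMENT" }); // logs { count: 1 }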

If I were translating this to React then I’d expect that within my component I’d register a listener against the store that would call this.setState({ ... }). Then setState() would trigger the framework to re-render that component. The problem with this plan in react-redux is that subscribe() is nowhere to be seen. Nor, for that matter, is store – so where did they go?
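
For reference, the manual wiring I was imagining would look roughly like this – a sketch only, using an assumed users slice, and not what react-redux actually asks you to write:

// hypothetical: subscribing to the store by hand inside a class component
componentDidMount() {
    this.unsubscribe = store.subscribe(() =>
        this.setState({ users: store.getState().users })
    );
}

componentWillUnmount() {
    this.unsubscribe(); // stop listening when the component is removed
}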

The answer is in the connect() method.

Connect()

When connect() is called on a component used inside a <Provider> tag, connect() automatically picks up the Provider’s store. This store is configured at the highest level of the application, something like this:

import React from "react";
import ReactDOM from "react-dom";
import { createStore } from "redux";
import { Provider } from "react-redux";
import reducer from "./state/reducers";

let store = createStore(reducer); // reducer is the outcome of combineReducers()

ReactDOM.render(
    <Provider store={store}>
        <Router />
    </Provider>,
    document.getElementById("example")
);

Because connect() encapsulates the store, it needs to provide an interface to allow the component to (implicitly) access the store. It does this via two arguments, which the documentation calls mapStateToProps and mapDispatchToProps.

mapStateToProps

mapStateToProps is effectively a filter that selects the bits of the global state from the store that are of interest to the component. The bits of state returned from this method are added to the this.props object.

This was the slightly counter-intuitive part in migrating from storing the data in React state. React documentation indicates that the data owned by a component should be stored in this.state. When it’s transferred to the store, that component is no longer strictly the owner and shouldn’t be changing it without informing the store, so it makes sense for it to be read-only, and therefore it moves to this.props.

mapStateToProps is the replacement for store.subscribe(). With it in place, any updates made to the store are filtered by the object returned from this method, and are then applied to the component’s this.props, causing the standard component updating lifecycle including the render.
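
As an illustration, a mapStateToProps to pair with the user-related actions below might look like this (the users and isLoading names are assumptions rather than parts of a real reducer):

// select only the parts of the global state this component cares about;
// whatever is returned here is merged onto this.props
const mapStateToProps = (state) => {
    return {
        users: state.users,         // assumed slice of the store
        isLoading: state.isLoading  // assumed flag
    };
};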

mapDispatchToProps

mapDispatchToProps allows us to access store.dispatch(). It receives the dispatch function as an argument, and should return an object containing methods that call dispatch with various actions. e.g.

const mapActionToProps = (dispatch) => {
    return {
        requestUsers: () => dispatch({ type: "REQUEST_USERS" }),
        updateUser: (user) => dispatch({ type: "UPDATE_USER", user: user })
    }
}

This enables the component to trigger actions by calling this.props.methodname(args).

connect(mapStateToProps, mapDispatchToProps) returns a function which should be passed the class or function of the component using the store. Convention also has the resulting component exported as the default export so it can be directly imported. e.g.

// in component file
export default connect(mapStateToProps, mapActionToProps)(ComponentName)

// in file that is using the component
import ComponentName from "./ComponentName"

React-Redux Cycle

Putting it all together we get a data flow something like this:
1. When something happens we dispatch an action to the store by calling an action property, i.e. this.props.requestUsers().
2. The method behind that property calls dispatch(action-object), as defined in mapActionToProps.
3. Redux then calls the reducer which returns a new state back to Redux.
4. The component receives the new state, filters it according to mapStateToProps and applies the filtered object to this.props triggering the normal React component update lifecycle and render.
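
Pulling those pieces together into one simplified, hypothetical component – UserList, users, and REQUEST_USERS are illustrative names rather than code from a real project – the whole cycle looks something like this:

import React from "react";
import { connect } from "react-redux";

class UserList extends React.Component {
    componentDidMount() {
        this.props.requestUsers(); // step 1: dispatch via an action property
    }

    render() {
        // step 4: this.props.users is refreshed whenever the store changes
        return <ul>{this.props.users.map(u => <li key={u.id}>{u.name}</li>)}</ul>;
    }
}

const mapStateToProps = (state) => ({ users: state.users });

const mapActionToProps = (dispatch) => ({
    requestUsers: () => dispatch({ type: "REQUEST_USERS" }) // steps 2 and 3 happen inside Redux
});

export default connect(mapStateToProps, mapActionToProps)(UserList);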

Starting React

I’ve been dabbling with the concept of a remote part-time job to provide a bit of personal funding as our business venture slowly gets off the ground, and this has highlighted the current popularity of React. It’s not like I’ve had my head completely in the sand, but the number of roles requiring a specific JS framework, rather than just JavaScript (plus a framework preference), did surprise me a little.

With all that in mind I decided to take a proper look at React, and compare it to my current front-end framework of choice, Aurelia. The fruits of this exploration are in GitHub and hopefully not too embarrassing :).

Philosophy

There is a continuum in current web libraries between being a full framework, like Angular or Aurelia, and a view library, where non-rendering activities like routing and dependency injection are not included. React sits at the view-library end of this spectrum, but with well-established and supported choices for the excluded activities. In this respect I’ve found React easier to learn than Aurelia because it lets the ecosystem be learned a step at a time – although it could also be said that significant experience with any of these frameworks would make learning the next easier.

React is strong on conventions, although at times it has felt like these conventions obscure what is going on and make ‘magic happen’, which is not a feeling I enjoy when developing. This is particularly so with react-redux.

It also leans more towards JavaScript’s dynamic typing than a static typing style, exemplified by the event handler example which does this.setState({ [name]: value });. However it also supports typed options like TypeScript, which requires some trawling through the index.d.ts file to come to grips with.
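
For reference, the handler behind that setState call – based on the forms example in the React documentation, with the field names coming from whatever name attributes your inputs use – looks something like this:

// one change handler shared by many inputs, keyed on the input's name attribute
handleInputChange(event) {
    const target = event.target;
    const value = target.type === "checkbox" ? target.checked : target.value;
    const name = target.name;
    this.setState({ [name]: value }); // computed property name – very dynamic
}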

Components

React is driven by the notion of small and composable components, and it does make creating and working with these very easy. The use of JSX/TSX files allows advanced logic to be mixed with markup giving complete flexibility over the creation of elements. There is a risk to this, which is that a component gets bloated instead of being split into smaller parts, so good code maintenance practices are important.
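
As a small illustration of that flexibility, a component can branch and map over data right inside the markup (TodoList and its items prop are hypothetical):

// a small functional component mixing logic with markup
const TodoList = (props) => {
    if (props.items.length === 0) {
        return <p>Nothing to do!</p>;
    }
    return (
        <ul>
            {props.items.map(item => <li key={item.id}>{item.text}</li>)}
        </ul>
    );
};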

Styling is fairly flexible, allowing anything from basic CSS use, referenced from the index.html, through to compiled and bundled SCSS using a packing system like webpack.

Data

My basic understanding is that data that is owned by a component goes into this.state and is modified by setState(), while data that is used by a component goes into this.props and is read-only. In a basic scenario (i.e. without Redux) data flows down via properties on this.props, and up via callback methods on this.props. e.g.

// in parent component
private handleChange(data) {
  this.setState({ filtered: data});
}

render() {
  return <Subcomponent all={this.all} onChange={(data) => this.handleChange(data)} />;
}

// in a subcomponent method
this.props.onChange(this.props.all.filter(f => f.something > 0));

This is certainly more laborious than Aurelia’s two-way binding, something React’s documentation acknowledges: “It can sometimes be tedious to use controlled components, because you need to write an event handler for every way your data can change and pipe all of the input state through a React component.” However this approach does make data passing and ownership very explicit and therefore understandable. Going to react-redux changes all this, and that’s something I’ll leave for a future post.

So there you have it – my first couple of days building something with React.

Using aurelia-ux v0.6

aurelia-ux is the pre-release UI component kit for aurelia. It has been developed throughout 2017 including a significant re-organization between v0.3 and v0.4. Due to this reorganization and its pre-release status, there is limited information out there on how to use it, with the best example at present being aurelia-store-todo. There is an indication that documentation is on the way and hopefully as part of that the team’s app-ux-showcase will be updated past v0.3. In the meantime I decided to dig into the source a bit and build a little something to share here.

Setup

This setup assumes you have aurelia-cli running, so that you can go to a command line and type au new to create the project. Open the project folder and add the following dependencies to the package.json then run npm install (or the yarn equivalent) to get the aurelia-ux packages into your node_modules folder:

  "@aurelia-ux/components": "^0.6.0",
  "@aurelia-ux/core": "^0.6.0",

aurelia-store-todo uses webpack as the bundler; however I will be sticking to the aurelia-cli. This means editing aurelia.json to include the dependencies. The first dependency is @aurelia-ux/core, which contains the core capabilities. Then each component needs to be added individually. There is also an @aurelia-ux/components package which may be intended to include all components, but including it causes build errors because the require exports for the referenced components cannot be resolved. The additions to the vendor-bundle dependencies in aurelia.json look something like this:

{
  "name": "@aurelia-ux/core",
  "path": "../node_modules/@aurelia-ux/core/dist/amd",
  "main": "index",
  "resources": ["**/*.{css,html}"]
},
{
  "name": "@aurelia-ux/input",
  "path": "../node_modules/@aurelia-ux/input/dist/amd",
  "main": "index",
  "resources": ["**/*.{css,html}"]
},
{
  "name": "@aurelia-ux/button",
  "path": "../node_modules/@aurelia-ux/button/dist/amd",
  "main": "index",
  "resources": ["**/*.{css,html}"]
},
{
  "name": "@aurelia-ux/form",
  "path": "../node_modules/@aurelia-ux/form/dist/amd",
  "main": "index",
  "resources": ["**/*.{css,html}"]
}

Using Components

The components are added as plugins to main.ts, like so:

import { PLATFORM } from 'aurelia-pal'; // needed for PLATFORM.moduleName

aurelia.use
    .plugin(PLATFORM.moduleName('@aurelia-ux/core'))
    .plugin(PLATFORM.moduleName('@aurelia-ux/input'))
    .plugin(PLATFORM.moduleName('@aurelia-ux/button'))
    .plugin(PLATFORM.moduleName('@aurelia-ux/form'));

This makes them available everywhere, without needing further require tags.

Each component has a tag and properties which can be used in a template file. For instance the button is <ux-button>, and an input can be bound to a view-model variable using value.bind, e.g. <ux-input value.bind="rows"></ux-input>. The elements appear unmodified in the output HTML and can be styled using CSS. Here is an example of using three components in a view and its view-model.
app.html:

<template>
  <require from="./app.css"></require>
  <ux-form>
    Rows:<ux-input value.bind="rows"></ux-input>
    Columns: <ux-input value.bind="cols"></ux-input>
    <ux-button type="raised" click.trigger="generate()">Generate</ux-button>
  </ux-form>
</template>

app.ts:

export class App {
    protected rows: number = 5; // arbitrary default
    protected cols: number = 5; // arbitrary default
    protected generate() { ... }
}

app.css (you may need to create this):

ux-form {
    padding: 2px;
    max-width: 300px;
    border: 1px solid transparent;
    border-color: var(--ux-design--primary-light)
}

CSS Variables

You may have noticed in the CSS above the use of a variable: var(--ux-design--primary-light). This is a mechanism for globally styling aurelia-ux, and it is set up in the app constructor as follows:

import { autoinject } from 'aurelia-framework';
import { AureliaUX } from '@aurelia-ux/core';

@autoinject
export class App {
    protected rows: number = 5; // arbitrary default
    protected cols: number = 5; // arbitrary default

    // ux.design members: 
    //     appBackground, appForeground, controlBackground, controlForeground, 
    //     primary, primaryForeground, primaryLight, primaryLightForeground, primaryDark, primaryDarkForeground, 
    //     accent, accentForeground, accentLight, accentLightForeground, accentDark, accentDarkForeground
    constructor(private ux: AureliaUX) {
        ux.design.primary = '#DD0088';
        ux.design.accent = '#CC88FF';
        // unfortunately we don't seem to be able to change the tints by just setting the base color, so set them as well
        ux.design.primaryLight = '#DB9BBF';
        ux.design.accentLight = '#E7C9FF';
    }

    protected generate() { ... }
}

The name of the CSS variable is --ux-design--<property-name> where the property name is one of the ux.design members above. You can see them all in your browser’s CSS viewer:
[Screenshot: the --ux-design-- CSS variables listed in the browser’s CSS inspector]
It is worth noting that CSS variables do not work in IE11 so if you expect to support that browser then aurelia-ux may not be suitable for you.

More

This is as far as I’ve gone with aurelia-ux so far, but there’s certainly more out there.

There are 11 components as it stands: button; checkbox; chip-input (a tag list); datepicker; form; input; input-info; list; radio; textarea; and switch. To find out what properties they expose download and run app-ux-showcase, or open your node_modules/@aurelia-ux/<name>/dist/amd/ux-<name>.d.ts and look at the public properties and methods.

There is also a theming system which enables you to define your own component themes by extending implementations of the UxTheme class. Any properties in those classes can be used as CSS variables. To fully grasp this at present I’d suggest downloading the aurelia-ux source.

Automated Testing Priorities

There’s a common theme in automated software testing that the greatest effort should go into unit tests, lesser into integration tests, and least into UI tests. This is known as the testing pyramid. However I’m not convinced this is the best use of automated test developers for web applications, and I believe this is because the nature of web standards and a commonly used web application architecture invalidate some of the assumptions behind the pyramid.

Testing Goals

Before we continue we need to state the goals of automated testing. In my mind they are:

  1. Validate the correctness of the thing under test
  2. Allow refactoring of the thing under test

Note that correctness includes ‘non-functional’ things, like authorization security.

From a business perspective, we want to know that the product works as intended. Working as intended means that the application performs correctly when used through its interfaces. This would suggest that UI tests are the most important, which is the opposite of conventional wisdom.

The reason often given for placing less focus on UI tests is that they have been considered notoriously fragile. However I posit that this has been due to the nature of the interfaces being tested, which have tended to make identifying and interacting with UI elements automatically very hard; e.g. having to use control ids with WinForms applications. I’m also suspicious that less focus on UI tests is a consequence of Agile methodologies that insist on jamming all testing into the same cycle as development, resulting in automation trying to be written against a UI in an extreme state of flux.

Unit Test Fragility

One problem I have with unit testing is that developers are encouraged to test the internals of the unit. This happens when mock objects are checked to see if certain methods were called on the mock.

The purpose of functions and classes is to expose some contract and hide the details of how that contract is fulfilled. Testing how a unit does its work means examining inside the black box, which defeats the purpose of using testing to support refactoring, because now we can’t make a change to the implementation of a unit without breaking its tests.

UI Test Fragility

In his 2012 article Fowler says:

An enhancement to the system can easily end up breaking lots of such tests, which then have to be re-recorded. You can reduce this problem by abandoning record-playback tools, but that makes the tests harder to write. Even with good practices on writing them, end-to-end tests are more prone to non-determinism problems, which can undermine trust in them. In short, tests that run end-to-end through the UI are: brittle, expensive to write, and time consuming to run.

I believe that some of these assumptions are less valid in modern web test automation.

Automated web testing tends to be hand-written because (in my experience) the recording tools can create quite fragile paths, usually because they don’t know what the least-variant information is. It is straight-forward to hand-write UI tests thanks to CSS selectors, which are easy to use, well-supported, and, when used simply (i.e. via id and class selectors rather than paths), aren’t hugely prone to change. These selectors are usually wrapped into page objects that further insulate the tests from changes.
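
To sketch what I mean by a page object – this assumes selenium-webdriver and a login form with hypothetical #email, #password, and #login-button ids:

import { By, WebDriver } from "selenium-webdriver";

// the CSS selectors live in the page object, not in the tests
class LoginPage {
    constructor(private driver: WebDriver) {}

    async login(email: string, password: string) {
        await this.driver.findElement(By.css("#email")).sendKeys(email);
        await this.driver.findElement(By.css("#password")).sendKeys(password);
        await this.driver.findElement(By.css("#login-button")).click();
    }
}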

The HTML DOM also exposes an event model which allows tests to mimic the vast majority of UI actions, removing the complexity of older-style tools which involved a lot of mouse coordinates and button states.
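
For example, a test can fire a realistic click through the plain DOM event API, no coordinates required (the #save-button id is hypothetical):

// simulate a click on an element found via a simple CSS selector
const button = document.querySelector("#save-button"); // hypothetical id
if (button) {
    button.dispatchEvent(new MouseEvent("click", { bubbles: true, cancelable: true }));
}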

And finally, in web development, UI testing has the added benefit of enabling testing across multiple browsers – something less applicable to downloaded applications.

However I agree that they remain time-consuming to run, and if there are lots of developers committing to the repository then running them in continuous integration on every commit may not be possible, reducing the benefit of the tests for quickly catching problems.

Middle-Ground – Integration Testing the API

It is increasingly common for web applications to be built as a web API and a web (JavaScript) client. This is my personal preference over server-side rendering as it nicely decouples the presentation from the logic and allows the application to more easily integrate with other applications. There is some development overhead in this approach, but given that most web pages perform some dynamic interaction with the server, and therefore require some level of client richness, this overhead is quite minimal.

Having an API provides an excellent place for automated testing. An API is a contract and will express most, if not all, of the business rules through its inputs and outputs. It also requires basic security, and allows validation and authorization to be checked. It can be easily extended to run more extensive security testing (i.e. by manipulating HTTP headers and sending malicious data) and performance tests.

Integration testing the API doesn’t mean a full environmental setup is required. It is still reasonable to use mocks for calls that are slow or resources that aren’t available. For instance my integration tests use .NET Core’s TestServer rather than running a web server, EF’s UseInMemoryDatabase rather than instantiating a database, and stub out AWS service calls. These are reasonable compromises because I’m confident those areas will perform to their contracts.

Conclusion

This is my ‘testing pyramid’ from highest to lowest priority:

  1. API integration tests
  2. Integration or unit tests for things that can’t be reached from the API
  3. UI tests for success scenarios

In my current application I have 98% code coverage and 90% branch coverage (largely ruined by not throwing exceptions inside all the C# using statements) of my business layer using the first two items on my list, and it has enabled considerable evolution and refactoring of the code-base over the last six months.

Generalists and Specialists

While I’m on my theme of people value, there is a group of technology professionals who are often quite undervalued – the generalists. Until the last couple of years there had been a trend of increasing specialization of technology development roles, notably the back-end/front-end split which is now being replaced again by that great generalist role, the full-stack developer. And here’s the thing – overall, the cost of generalists and specialists doesn’t vary heavily, with subject specialists – e.g. InfoSec, DBAs, Authentication Architects (I saw that one advertised in Wellington) – costing a bit more, and platform specialists – e.g. Java dev, UI dev – a little less. In this continuum of generalists to specialists, generalists represent an absolute bargain.

The Generalist Developer

An experienced developer can do 90% of what a specialist can do in their field.

Need DevOps? Why not get your developer to do it? They can research potential solutions, read and understand API documentation, pick up Bash or PowerShell pretty quickly, and set up basic configurations based on recommended best practice from vendors. Plus when they’re done, they can go back to development rather than twiddling their thumbs.

Need QA automation? Need requirements analysis? Need basic network setup? Need Project management? Need customer support? Need internal IT? Need architecture? Need a DBA? These are all things I’ve done to a production level in my 14 years primarily as a ‘developer’.

The vast majority of software out there is about understanding, automating, and transforming processes, and generalists are amply qualified to solve these problems. And where they can’t solve a problem from experience, they are expected to go out and research the plethora of alternatives, running a huge gamut of potential technologies (and therefore specializations), and pick a solution.
Sure, they may not create the algorithm that founds the next Google, but those companies represent a minuscule segment of the field and require a level of specialization more likely found in academia than in industry anyway.

In software product development you want generalists. These are people who know that, for instance, information security is important so they pick technologies, tools, and solutions that promote good security practice. And because they’re not as sure of themselves they are more likely to test and verify their assumptions and designs. They also have a wide view of the world, so can much more effectively evaluate trade-offs between different solutions and solve a wider range of problems than a specialist can. And typically, for little difference in price!

The Specialist

I’m not suggesting we don’t need specialists at all. I’ve found their value to be acting in advisory or consultancy roles where they are the checks-and-balances that warn people about traps and pitfalls and educate the generalists on best practices. I freely acknowledge I wouldn’t have some of the knowledge I have today without the support specialists have been able to provide.

However this very act of education decreases the relative value of the specialist because, by receiving more knowledge, the generalists ‘level up’ and reduce their knowledge gap in the specialist’s field. That makes the need for the specialist more tenuous, and some people find it challenging to overcome the instinct to protect one’s space. This assumes that specialists are static creatures, and I would expect they too are continually learning and trying to level up, but within one organization the usefulness of that extra knowledge may be limited.

Another problem with specialists in a small organization is that they effectively constrain your solutions. The company thinks, “well, I’ve got an Oracle DBA so we’d better use Oracle and stored procedures”, even if it’s not the best solution. Whereas a generalist will evaluate a range of solutions based on their experience, the company’s accumulated knowledge and environment, industry trends, future hiring needs, relative costs, and so on, to inform the solution choice.

Conclusion

If you’re a five person development shop, a specialist doesn’t make sense. If you need that expertise you should hire a consultant. If you’re a five hundred or five thousand person development enterprise, then those specialists should have plenty to do to make it worth having them on the payroll.

Perverse Incentives

Setting

It is well established that CEO pay has climbed astronomically in the last decades, and that over a similar period inequality has grown throughout the Western world. This period has also been dominated by laissez faire economics, and corporatism to the extent of corporate welfare.

There are some indications that we are swinging away from this setting, with even the IMF saying, “Empirical evidence suggests that it may be possible to increase [personal income tax] progressivity without adversely affecting economic growth” (IMF Fiscal Monitor, October 2017), but it will be some time, if ever, before the mind-set of actors in this economic dance changes.

Right now, you might be wondering whether this is a software blog or some economic philosophy one, but be assured, this relates back to management, and particularly how it impacts high-skill industries such as software development.

Introduction

Our current economic setting has created a system of incentives that is at odds with the goals of good management.

Good management in knowledge industries emphasizes an ‘upside-down pyramid’ of management which supports, rather than directs, the activities of skilled front-line executors. Put another way, it genuinely recognizes that people are the most important asset in a business and that the role of management is to create an environment where those people with the skills can excel.

It is also clear that managers can add value in ways that others can’t. They can use their outright authority to connect otherwise separate groups, resolve conflicts, and use their bird’s-eye-view of the business to sponsor change and ensure strategic alignment, a.k.a. ensuring the right work is being done (HBR, 1995).

In our society we equate value with money, and given the greater value managers can add, pay managers more. We also expect more return from more money, so we expect increased responsibility and accountability from said managers. But here we reach the crux of the problem: to support skilled staff it is important to empower them with responsibility, so how can the manager be held accountable whilst giving away the responsibility? To readers who (like myself) have been managers this is the “no kidding” moment, so I would ask: how have you tried to change that? This is a systemic problem and, as far as I can tell, our approach has been grudging acceptance of the status quo.

How strong is that responsibility?

Good managers empower people, and people make mistakes. It is unfair to hold the manager responsible for those mistakes, otherwise everyone would be too afraid to make mistakes and we’d destroy effectiveness and any hope of innovation. We also read in the media stories of obfuscating CEOs who (quite reasonably) admit they couldn’t have known that X was happening in their organization and so (will make some knee-jerk change to resolve it, and) can’t really be held responsible.

By highlighting that they’re ‘not really that responsible’, the premise that with increased value comes increased responsibility has been completely undermined. Now this isn’t the only reason managers are paid more: the other commonly held notion is that managers add more value because their decisions have greater (financial) consequences. This I dispute, largely because those decisions are never made in a vacuum, and there are a lot of advisors and analysts who effectively determine the decisions in the first place, with the executive being closer to a rubber stamp than a judge. But that would be a digression from the point of this post, which is to focus on the consequences of ‘having responsibility’.

Micromanagement

Responsibility encourages micromanagement. When your head is on the line then your instinct is to get as involved as possible to ensure the outcome you want. There are plenty of good managers out there who manage to overcome this instinct, but the system is very much setup to encourage micromanagement, and that destroys empowerment.

Under micromanagement, instead of having a team who are comfortable pursuing their own direction to achieve the goals, you’ve now got a sort-of team member (the manager) with a much more limited view of the details (after all, the higher up an organization you are, the wider but shallower view you’re expected to have) who ironically requires more energy from the team members to ‘manage’. This also makes the team feel less ownership of their work because accountability becomes blurred between the manager and the team. And instead of being measured by outcomes, the team are judged on details; details that the manager is often not as qualified (as they think they are) to judge.

Micromanagers can also become the de facto communication channel instead of the more effective approach of organizing teams to communicate with each other. This creates a knowledge and communication bottleneck, which is inefficient.

What does an effective manager do? They set a vision and goals, empower people to execute on them, provide cover so they can achieve their goals, resolve conflicts, and then put their feet up or have strategy off-sites or something like that. They should not be required to answer all the questions other people in the organization have – the team is the most capable to do that – but they can filter the questions to ensure the team can focus on the work.

But if your organization insists on you knowing everything because you are ‘responsible’ because you are expensive, then how can you be an effective manager?

Solution

So how do we fix this?

Essentially, be realistic about managers’ roles. If their roles are seen to be closer to the text-book roles of planning, leadership, organization, and control, rather than a know-it-all of everything in their fiefdom, then responsibility is going to fall closer to where it is held: with the executing team.

With this reduction in perceived responsibility and expectation there should be a reduction in compensation.

This will also improve empowerment among the teams, which will give them a greater sense of ownership and satisfaction, meaning they’re less likely to turn over.