No Pure Ideology

Welcome to another essay reaching Brooks’ “No Silver Bullet” conclusion.

Genesis from Analogy

I’ve been reading Klein’s somewhat harrowing The Shock Doctrine and I found an interesting (to my twisted mind) analogy between the application of pure capitalism and the application of pure software concepts/paradigms/technologies, or as I have titled this, ideology.

Around 1990 the Cold War was over and capitalism had emerged as the triumphant economic ideology. But this capitalism, which had underpinned the great scientific and social achievements of the 1950s-1970s, was very different – vastly more state-controlled – from the free-market approach of ‘pure’ capitalism espoused by Friedman, and with which we are more familiar today. During the 1990s there were major attempts to achieve this pure capitalism in Russia and East Asia, and both were dramatic failures. The attempted conversion of Russia to capitalism was a disaster for all but a few scavenger capitalists, and even the IMF admitted that its approach to the East Asian financial crisis late in the decade significantly exacerbated what was a fairly minor problem. Since then, such unfettered free-market reforms – ‘pure’ capitalism – have come to be viewed with much more suspicion and caution, and it’s quite ironic that in the demise of its ideological enemy, communism, capitalism should lose such face.

My supposition is that no ideology when enacted in its purest form will be successful, as there will always be aspects of the target domain that the ideology doesn’t address effectively. Oddly enough most economic-political ideologies fail in the same area, which is the negative impact they have on the population, a population which typically has no input into the selection of the ideology. The most successful outcomes arise when ideologies are mixed and moderated, as in ‘Swedish socialism’ or the aforementioned success of the tightly controlled United States of the third quarter of the 1900s. I believe that this mixing of concepts to achieve success applies just as much to software development as it does to nation building.

Back to software development

When I started software development, object-orientation was the great new hope, heralded through its flagship, the recently released Java. Java, with its fairly pure object-oriented implementation, solved the problems of imperative languages like C and Pascal through encapsulation – the idea that data and the actions that manipulate it should be bound together and hidden behind some contract.

Except that Java sometimes made things more difficult. One aspect that has consistently bugged me is event handling, where Java demanded an extra object to implement an event interface separately from the object that wanted to respond to the events. By comparison, C was happy with a function pointer (albeit hidden behind some typedef because the type signatures for function pointers were not exactly intuitive).

Another example of Java’s purity, although not related to object-orientation, was the need to state all the exceptions a function could possibly throw. This meant that aggregating functions either needed a ridiculously unmanageable list of exception types appended to them or that every function was simply decorated with the base exception, making the whole thing pointless.

C#, some five years later, was much more pragmatic about both of these cases, supporting function references (delegates) and keeping exception handling in the code body rather than in declarations. C# is a much less pure object-oriented language than Java, but (in my experience at least) is better for ‘getting things done’.
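
As a rough illustration of that pragmatism (my own sketch, not from the original post), a C# event handler is just a method or lambda attached via a delegate, with no separate listener object:

using System;

public class Downloader
{
    // the event is just a delegate; any matching method or lambda can subscribe
    public event EventHandler<string> FileCompleted;

    public void Download(string file)
    {
        // ...do the work, then raise the event
        FileCompleted?.Invoke(this, file);
    }
}

public static class Program
{
    public static void Main()
    {
        var downloader = new Downloader();
        // no separate listener class needed - the handler is referenced directly
        downloader.FileCompleted += (sender, file) => Console.WriteLine($"Finished {file}");
        downloader.Download("report.pdf");
    }
}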

Another example

Functional programming as a pure ideology has struggled to gain traction outside of academia. For all the expressive power of being able to write ‘what’ a program needs to do rather than ‘how’ to do it, in its purest form functional programming makes it surprisingly hard to do the most basic computational task: modifying data. I’ll admit it’s been a while since I last got my head fully stuck into monads and lenses, but I recall that genuinely understanding them was fairly challenging compared to understanding, say, manual memory management and pointer manipulation.

In its attempt to bring functional programming into the mainstream, F# takes the non-pure approach of allowing explicit mutability (while defaulting to a functional style). Naturally this weakens the benefits of the pure paradigm, but it also allows people to get things done! Again, the pragmatic compromise provides a more workable solution, while the pure concept remains a curiosity.

Conversely, functional concepts, like list-processing, have made their way into other paradigms where their expressive power is a welcome respite from pure imperative constructs, like nested loops.
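
For instance (again my own sketch, not from the original post), C#’s LINQ brings exactly this kind of declarative list-processing into an imperative language, replacing nested loops and mutable accumulators with a single expression:

using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        var orders = new List<(string Customer, decimal Total)>
        {
            ("Ana", 120m), ("Ben", 80m), ("Ana", 40m), ("Cat", 200m)
        };

        // declarative pipeline: grouping, filtering, and projection in one expression,
        // where the imperative version would need nested loops and temporary collections
        var bigSpenders = orders
            .GroupBy(o => o.Customer)
            .Where(g => g.Sum(o => o.Total) > 100m)
            .Select(g => g.Key);

        Console.WriteLine(string.Join(", ", bigSpenders));   // prints: Ana, Cat
    }
}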

Conclusion

Whether it’s micro-services, CQRS, REST, TDD, or any other of the myriad of software concepts, the costs of a pure implementation of a concept always outweigh the benefits. This is because, as with our political ideologies, each concept focuses on a specific type of problem – its core domain – and simply can’t handle all the situations that real software requirements will throw at it. Every concept has its pros and cons, problems where it is a good fit and places where it isn’t. To solve problems effectively we must understand what options best fit in a given situation and how a combination of concepts can be applied to achieve the desired end in a quick, inexpensive, and good-quality fashion. Or as Brooks put it, there is no silver bullet.

Criticizing the Past

Hindsight is supposedly 20/20, yet when it comes to appraising existing code and solutions we often run our mouths with the blinders on.

Existing code nearly always seems poor because when people look at code they look with their current perspective and not with the perspective of the time it was written. As the writer, we’re likely to have learned new coding techniques or learned more about the domain, and had we known those things we’d have chosen different approaches. As the reader, we seldom understand the constraints that were in place when that code was created, i.e. considerations like time-to-market, customers demanding fixes, and the time and resources available to developers for research.

Human Factors

It is important that we evaluate past decisions – we can’t learn without analysis – and very often it is necessary in order to allow changes to the solution. Trouble arises because we’re human and have feelings, and don’t particularly enjoy being criticized, even in the pursuit of learning.

One solution to the human problem, egoless programming (summarized here), seems like a pretty healthy concept. However, it essentially asks people to be detached from what they’ve achieved, which to me seems like a recipe for mediocrity. I’m from the software craftsman camp, where people take pride in the work they’ve done (while still accepting that nobody’s perfect); and people take better care of the things they have pride in.

Feelings exist, so how do we manage them? Egoless programming highlights a key message that is surprisingly often forgotten: “Critique code instead of people”; or, to use the sporting metaphor I prefer, “play the ball, not the player”. At times it seems like people forget that other people are in the room when evaluating old code – rather than “this doesn’t make sense to me”, it’s “this is spaghetti-tangled rubbish” or “what idiot put this together”. At some point our frustration at trying to understand something turns into an utter lack of tact, and as peers we shouldn’t tolerate that – we should be quick to point out the potential for hurting feelings and our own limitations.

Reader’s Perspective

As mentioned in the introduction, anyone reading code or analyzing solutions brings with them their perspectives based on their knowledge and experience. When that experience doesn’t include the relevant business constraints, their evaluation can sound horrible – like the coding equivalent of saying “you’d be much better off if your parents had paid for a better school” to someone from a poor background – circumstances matter!

Our experience can also bias us towards solutions that we understand, and make us uncertain of different approaches. It seems anachronistic to have to say this in the era of equality, but just because something is different doesn’t mean it’s wrong! And we don’t learn anything by sticking to what we know!

Instead, solutions should be considered and appraised for what they are, while being considerate of circumstances. For instance, I find the stack depth that results from the design patterns heavy approach used in Java quite frustrating to debug. But I also appreciate that Java was envisaged as a pure object-oriented language which places some significant constraints on what the language can do, and in particular excludes first-class functions. Rather than saying anything too negative about Java, the lesson I’d take from this is that it’s good to be pragmatic rather than ideological when designing a language.

Conclusion

Don’t get me wrong – I haven’t written this from the perspective of a saint. I’ve been on both sides of the scenarios I’ve mentioned. The key is that I’ve learned from those experiences and hope by sharing that others can learn too.

Finally, don’t forget the lesson of self-deprecating humor: criticizing your own code in negative terms is fine; criticizing someone else’s code in those terms is not.

Hype

I like shiny new things.

It’s pretty universal that we humans are attracted to what is new, the latest-and-greatest, whether out of the likes of curiosity, restlessness, or envy. On top of this is layered a positive feedback cycle – our desire to be part of the group and be in on the topics of conversation – that reinforces the shiniest and most popular objects out there. These can be called fads or trends, and at times they are rightfully the center of attention, but when the reality can’t live up to the expectation then we’ve entered the world of hype.

It has been very clear right from my first days at university that ongoing professional development is essential to maintaining employability, and few industries move as fast as software development. Learning about new technologies by playing with them a little is both healthy and important, and gives us an ability to make better choices about them in our professional careers. What I want to focus on, however, is when businesses succumb to the hype.

A business becomes a victim of hype when it adopts a new technology for the sake of adopting it rather than considering whether it adds business value. Lest I sound like a stuck-in-the-mud, there are a myriad of ways that new technologies add business value: they may make development faster or more flexible, or interact more smoothly with other enabling products; they may simply have lower costs, whether by virtue of licensing, labour, or lifetime/support costs; they may make the company more attractive to potential employees.

But a new technology also comes with costs: time for individuals to come up to speed with the technology and ecosystem; reduced ability to deliver product and therefore respond to organization needs and competitive threats (Joel Spolsky described rewriting your codebase as the “single worst strategic mistake that any software company can make”); staff turnover as specialists opt to take their existing expertise elsewhere, and replacements require extensive training in the organization’s domain.

Failing to evaluate the costs and benefits of adopting new technology, and/or planning how it is adopted, can seriously damage a business. And yet it still happens, and I posit this is partially to do with our pursuit of the new and shiny, but equally important is the ‘resume factor’, which brings us to the next section…

Recruitment

Hype is driven by recruitment. This is more so in IT than in many other industries where professional standards or outright experience tend to carry far more weight. In IT, there remains a prejudice that, because the industry moves so fast, existing practitioners are more likely to be lagging behind the forefront of the industry.

Firstly, this is a poor assumption. Whilst I have seen some individuals flat-line their professional development, the vast majority of people I’ve worked with are doing what they can to keep up with the changes and trends in the industry, both in their work and spare time. Secondly, with experience comes wisdom, and we need that to grow as an industry or we will keep making the same mistakes.

Finally, what behaviour is this going to drive in a mature developer? They’re going to put their weight behind adopting the new technology. As altruistic as we like to try and be, there is always a tension between career needs and business needs, and if there is an opportunity to advance their career, employees are likely to favor that over overall business value, because they can move on more easily.

Is Newer Better?

In situations where you’re talking about replacing some part of an existing codebase, then a cost-benefit analysis has to compare the existing technology with the new option(s). And the first question that should come up is, is it better?

I know that sounds both obvious and a little stupid, but hype has a way of making the decision-making process turn a bit stupid at times. Perhaps I should clarify by adding the implied rest of the sentence: is it better for your organization?

At present my web framework of choice is Aurelia. Whilst it isn’t the best known, it has solid support, ongoing development, and just makes things easy. But I need to keep up with what is going on in the industry so I had a play with React, and have subsequently met it more (reading rather than writing) in my current job. Based on my needs, I feel like React is hype. Why (in my eyes)?… The way state and event management works is a big step back from the fairly seamless data-binding of Aurelia or Angular or most non-web UI frameworks, and it requires quite particular data flows for props. Whilst it doesn’t require Redux, I’m not sure I’ve ever seen a job ad for React that doesn’t include Redux, and Redux is massive overkill. I’m not sure I’ve ever needed to centralize state like that in a single page application, nor manage it in a pseudo-CQRS style. If I was building something as interactive as a spreadsheet then Redux might be handy, but the majority of web applications are still focused on one UI element at a time and can fetch data on demand without it really interrupting the UX. What React does very well is componentize – it certainly feels more natural to make components in React than Aurelia.

In short, my cost-benefit analysis says that Aurelia is better for my purposes than React. Decisions are always made with limited information, and people with more knowledge of React would certainly disagree with my assessment. And this is the point – my environment, background, and needs are different from theirs, so we should reach different conclusions.

Conclusion

I want to work with things that are new because I like to learn things, but I also work for a business that needs sound decision making that considers far more than ‘what’s cool’. As software professionals we have to keep on learning so we can provide and evaluate all the options to the businesses that we are involved with, but we also have to be able to step back from the hype and make sound technology decisions based on our environment and the good of the business.

Rate Limited Async Loop

A recent project included some modest load testing. For this we created a small console application to hit our API over HTTPS. A key metric in load testing is the number of requests an endpoint can handle per second, so it’s useful to be able to control and configure the rate at which requests are made.

This in itself is not difficult: a basic sleep wait of duration 1/requests-per-sec will achieve this. However we had an additional constraint that called for a slightly more complex solution.

The application uses Auth0, an authentication-as-a-service provider, and it rate limits use of its API. Exceeding the rate results in failed HTTP requests, and if frequent enough, can result in users being blocked. Furthermore, it is a remote and relatively slow API, with round-trip times in the order of 3 seconds (i.e. fetching 100 users serially would take 5 minutes), so it’s important that we access it concurrently, up to our limit. Additionally, the token received from calling it is cachable until its expiry, and if we can get the token from our cache then we want to skip any sleep-wait in order to minimize running time.

This leads to the goal: to maximize the number of concurrent requests made to an API up to a fixed number of requests per second; and to use cached data (and therefore not use a request) where possible. To solve this I want a rate-limited concurrent loop.

Implementation

A little searching on the internet resulted in either extensive libraries that implemented a different paradigm, like Reactive, or things that didn’t quite meet my requirements. I therefore – having taken the appropriate remedies to treat potential Not-Invented-Here Syndrome – went ahead and put something together myself.

public class RateLimitedTaskProperties
{
    public bool IgnoreRateLimit { get; set; }
}

public static async Task RateLimitedLoop<T>(int perSec, IEnumerable<T> enumerable, Func<T, Task<RateLimitedTaskProperties>> action)
{
    int periodMs = 1000 / perSec;
    var tasks = new List<Task<RateLimitedTaskProperties>>();
    foreach (T item in enumerable)
    {
        T capture = item;
        Task<RateLimitedTaskProperties> task = action(capture);
        tasks.Add(task);

        // a task that completed synchronously (e.g. served from cache) asks to skip the
        // rate limit, so move straight on to the next item without sleeping
        if (task.IsCompleted && task.Result.IgnoreRateLimit)
            continue;

        System.Threading.Thread.Sleep(periodMs);
    }

    await Task.WhenAll(tasks);
}

The loop starts a new task every periodMs. Concurrency is achieved by using tasks, which are non-blocking, and waiting for their completion outside the loop with await Task.WhenAll(tasks). The case where something has been retrieved from a cache is handled by the task returning synchronously and setting the IgnoreRateLimit flag. This combination causes the loop to skip the sleep and move straight on to triggering the next task.

The following is an example of its use, where MyOperation() is a method that returns a flag indicating whether or not it performed a fetch from the rate-limited API.

const int tokenReqsPerSec = 5;
await RateLimitedLoop(tokenReqsPerSec, items, async(item) =>
{
    bool requiredFetch = await item.MyOperation();
    // don't rate limit if I got it from the cache (fetch wasn't required)
    return new RateLimitedTaskProperties { IgnoreRateLimit = !requiredFetch };
});

django ms-sql datetimeoffset

My current project has me dealing with python, which is a language I’ve dabbled with for many years, but I think this is the first time using it professionally. It’s quite interesting seeing where the language has evolved: I recall having previously been enamored with the quasi-Lisp approach of processing lists with map and filter, but have found list and dictionary comprehensions to be the current standard.

The project is a small Django API (if I were selling it to a VC then it’d be called a microservice :rolleyes:) and as the back-end is predominantly in the Microsoft stack, it references a SQL Server database. This database includes some DateTimeOffset columns. I’m not sure I see the need for this type – dates should always be stored as UTC for maximum portability, and clients can display the local time based on client settings. If it’s necessary for a service to work with those dates, then the database should store a user timezone name or offset, but that is specific to the user and not the date. Anyway, I digress… Unfortunately DateTimeOffset columns are not natively supported by the common python ODBC connectors, and thus something of a workaround was required.

This was made extra challenging by Django, which intermediates the database relationship via its models, and therefore thwarted some early attempts to treat the columns as bytes. What it does expose is a connection_created signal which allows the connection to be intercepted before it is used, and that connection includes a method add_output_converter for handling ODBC types. In this case the type is -155 and using a little struct magic we can construct a python datetime.

One area of concern was ensuring that the signal handling was tidied up, even if exceptions were thrown. To handle this, the DateTimeOffset handling code was wrapped into a class that supports the with statement.

Apologies in advance if some of this python code is highly naive – as already noted – it’s my first professional python foray 🙂

import struct
import datetime
from django.db.backends.signals import connection_created

class DjangoSqlDateTimeOffset(object):

    def __enter__(self):
        connection_created.connect(self.on_connection_created)

    def __exit__(self, exc_type, exc_value, traceback):
        # to see connection info, including queries, uncomment and look at cnx with settings.py DEBUG = True 
        # cnx = connections['qs-sql']
        connection_created.disconnect(self.on_connection_created)

    def on_connection_created(self, sender, **kwargs):
        conn = kwargs['connection']
        conn.connection.add_output_converter(-155, self.handle_datetimeoffset)

    def handle_datetimeoffset(self, dto_value):
        tup = struct.unpack("<6hI2h", dto_value)  # e.g., (2017, 3, 16, 10, 35, 18, 0, -6, 0)
        tweaked = [tup[i] // 10000 if i == 6 else tup[i] for i in range(len(tup))]
        dto_string = "{:04d}-{:02d}-{:02d} {:02d}:{:02d}:{:02d}.{:05d} {:+03d}{:02d}".format(*tweaked)
        return datetime.datetime.strptime(dto_string, '%Y-%m-%d %H:%M:%S.%f %z') 

With that class available, querying DateTimeOffset columns becomes nice and simple:

with DjangoSqlDateTimeOffset():
  item = ModelName.objects.raw('SELECT ... ')

PowerShell History

I do like PowerShell, but sometimes find myself pressing the up-arrow a lot to find commands made in previous sessions. Unfortunately the F8 search shortcut only works with the current session, so I wanted a way to find older commands more easily.

Knowing that PowerShell can retrieve history from older sessions, I assumed it must be stored on disk, and after a bit of guessing found this file: %appdata%\Microsoft\Windows\PowerShell\PSReadline\ConsoleHost_history.txt

To make it a bit more useful, I’ve removed common commands and duplicates using the following script.

$patterns = @("^cls", "^cd.*", "^\w:", "^exit", "^mkdir")

Get-Content "$env:APPDATA\Microsoft\Windows\PowerShell\PSReadline\ConsoleHost_history.txt" | 
    Select-String -pattern ($patterns -join "|") -notmatch | 
    Select -Unique |
    Out-File commands.txt

Auth0 Mock

Auth0 is a well-known authentication-as-a-service provider. Its database connection storage option allows organizations to reference a custom database, which is very useful if you want to store your user information with your business data and maintain integrity between those using foreign key constraints. You can do this in Auth0 by setting up a connection that accesses your hosted database (with appropriate firewall restrictions!) to add, update, and remove users.

A challenge with this is that each new environment requires a new database and Auth0 setup. This is particularly difficult if that environment is a developer’s machine and isn’t accessible to a connection string from the internet (due to Firewalls/NAT). One option is for each developer to have their own cloud database, but that gets expensive quickly, and adds unrealistic latency to database calls from their machine, making development more difficult.

I was faced with this problem while building integration tests using Auth0 and .NET Core, and opted to create a mock object.

Implementation

The top level interface for Auth0 in C# is IManagementApiClient. This consists of a number of client interface properties, and it’s these that I found most appropriate to mock using Moq. This leads to a basic structure as follows:

using System;
using Auth0.Core;
using Auth0.Core.Collections;
using Auth0.Core.Http;
using Auth0.ManagementApi;
using Auth0.ManagementApi.Clients;
using Auth0.ManagementApi.Models;
using Moq;

public class Auth0Mock : IManagementApiClient
{
  Mock<IUsersClient> _usersClient = new Mock<IUsersClient>();
  Mock<ITicketsClient> _ticketsClient = new Mock<ITicketsClient>();

  public Auth0Mock()
  {
    // setup for _usersClient and _ticketsClient methods
  }

  public IUsersClient Users => _usersClient.Object;
  public ITicketsClient Tickets => _ticketsClient.Object;

  public IBlacklistedTokensClient BlacklistedTokens => throw new NotImplementedException();
  // etc. for ClientGrants, Clients, Connections, DeviceCredentials,  EmailProvider, Jobs, Logs, ResourceServers, Rules, Stats, TenantSettings, UserBlocks
  public ApiInfo GetLastApiInfo()
  {
    throw new NotImplementedException();
  }
}

In this project only a small number of Auth0 methods were used (something I expect would be true for most projects), so only a few Auth0 client methods actually needed to be mocked. However it is quite important, for integration testing, that these methods replicate the key behaviours of Auth0, including writing to a database, and storing user metadata (which isn’t always in the database). To support these, the mock class includes some custom SQL, and a small cache, which are used by the mocked methods. The following code illustrates this using two methods. They are set up in the constructor, and implemented in separate methods.

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
using Dapper;

private string _sql;

// local cache storing information that our sql table doesn't
private Dictionary<string, Auth0.Core.User> _users = new Dictionary<string, Auth0.Core.User>();

public Auth0Mock(/* injection for _sql connection string */)
{
  _usersClient.Setup(s => s.CreateAsync(It.IsAny<UserCreateRequest>()))
    .Returns((UserCreateRequest req) => CreateAsync(req));
  _usersClient.Setup(s => s.DeleteAsync(It.IsAny<string>()))
    .Returns((string id) => DeleteAsync(id));
}

private async Task<Auth0.Core.User> CreateAsync(UserCreateRequest request)
{
  int userId = 0;
  using (var conn = new SqlConnection(_sql))
  {
    var rows = await conn.QueryAsync(@"INSERT INTO [MyUserTable] ...", new { ... });
    userId = (int)rows.Single().userId;
  }

  var user = new Auth0.Core.User
  {
    AppMetadata = request.AppMetadata,
    Email = request.Email,
    FirstName = request.FirstName,
    LastName = request.LastName,
    UserId = "auth0|" + userId
  };
  _users[user.UserId] = user;
  return user;
}

private async Task DeleteAsync(string id)
{
  var match = Regex.Match(id, @"auth0\|(.+)");
  string userId = match.Groups[1].Value;

  using (var conn = new SqlConnection(_sql))
    await conn.ExecuteAsync(@"DELETE FROM [MyUserTable] ...", new { userId });

  if(_users.ContainsKey(id))
    _users.Remove(id);
}

Being a mock object there are limitations. For instance, in this example the cache only includes users added via CreateAsync, not all the users in the test database. However, where these limitations lie depends entirely on your testing priorities, as the sophistication of the mock is up to you.

One downside to this approach is that Moq doesn’t support optional parameters, so the signatures for some methods can get quite onerous:

_usersClient.Setup(s => s.GetAllAsync(0, 100, null, null, null, null, null, It.IsAny<string>(), "v2"))
  .Returns((int? page, int? perPage, bool? includeTotals, string sort, string connection, string fields,
            bool? includeFields, string query, string searchEngine)
    => GetAllAsync(page, perPage, includeTotals, sort, connection, fields, includeFields, query, searchEngine));

private Task<IPagedList<User>> GetAllAsync(int? page, int? perPage, bool? includeTotals, string sort, string connection, string fields, bool? includeFields, string query, string searchEngine)
{
  // regex to match query and fetch from SQL and/or _users cache
}

Authorization

The Auth0 mock class provides authentication, but not authorization, and it would be nice if any integration tests could also check authorization policies. The run-time system expects to process a cookie or token on each request and turn that into a ClaimsPrincipal with a set of claims. Therefore our tests must also populate the ClaimsPrincipal, and do so before authorization is checked.

For this we need a piece of middleware that goes into the pipeline before authorization (which is part of UseMvc()). My approach was to place the call to UseAuthentication() into a virtual method in Startup and override that method in the test’s Startup:

public class TestStartup : Startup
{
  protected override void SetAuthenticationMiddleware(IApplicationBuilder app)
  {
    app.UseMiddleware<TestAuthentication>();
  }
  
  protected override void SetAuthenticationService(IServiceCollection services)
  {
    // This is here to get expected responses on Authorize failures.
    // Authentication outcomes (user/claims) will be set via TestAuthentication middleware,
    // hence there are no token settings.
    services.AddAuthentication(JwtBearerDefaults.AuthenticationScheme).AddJwtBearer();
  }
}
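
For context, the corresponding virtual methods in the production Startup might look something like this (a sketch of my own based on the description above, not the project’s actual Startup):

using Microsoft.AspNetCore.Authentication.JwtBearer;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
  public void ConfigureServices(IServiceCollection services)
  {
    SetAuthenticationService(services);
    services.AddMvc();
  }

  public void Configure(IApplicationBuilder app)
  {
    // authentication goes into the pipeline before MVC, and therefore before authorization
    SetAuthenticationMiddleware(app);
    app.UseMvc();
  }

  // overridden in TestStartup to substitute the TestAuthentication middleware
  protected virtual void SetAuthenticationMiddleware(IApplicationBuilder app)
  {
    app.UseAuthentication();
  }

  // overridden in TestStartup to register a bare JwtBearer scheme
  protected virtual void SetAuthenticationService(IServiceCollection services)
  {
    services.AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
      .AddJwtBearer(options => { /* production token settings */ });
  }
}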

The middleware, TestAuthentication, remembers the last user that was set. It must be registered as a singleton with the dependency-injection framework so that the user is remembered between service calls. Testing code can set the user at any time by calling SetUser().

When a request is made TestAuthentication‘s InvokeAsync method applies claims based on that user. These claims will be processed as policies in the normal way so that Authorize attributes work as intended.

public class TestAuthentication : IMiddleware
{
  private string _userId;
  private string _roleName;

  public async Task InvokeAsync(HttpContext context, RequestDelegate next)
  {
    if (!string.IsNullOrEmpty(_userId))
    {
      // an authentication type ("Test" here is an arbitrary scheme name) is needed so the
      // identity counts as authenticated and Authorize policies evaluate against it
      var identity = new ClaimsIdentity(new List<Claim>
      {
        new Claim("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier", "auth0|" + _userId),
        new Claim("http://myuri/", $"Role:{_roleName}")
      }, "Test");

      var principal = new ClaimsPrincipal(identity);
      context.User = principal;
    }
    await next(context);
  }

  public void SetUser(string userId, string roleName)
  {
    _userId = userId;
    _roleName = roleName;
  }
}
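
As a rough sketch of how this hangs together (the test host setup, user id, role, and route below are illustrative stand-ins, not from the original project), TestAuthentication is registered as a singleton in TestStartup and a test drives it via SetUser():

using System.Net;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

public class AuthorizationTests
{
  [Fact]
  public async Task Admin_user_can_reach_protected_endpoint()
  {
    // TestAuthentication must be registered as a singleton in TestStartup
    // (services.AddSingleton<TestAuthentication>()) so that the instance we call
    // SetUser() on is the same one UseMiddleware<TestAuthentication>() resolves per request
    var server = new TestServer(new WebHostBuilder().UseStartup<TestStartup>());
    var client = server.CreateClient();

    var testAuth = server.Host.Services.GetRequiredService<TestAuthentication>();
    testAuth.SetUser("12345", "Admin");   // hypothetical user id and role

    var response = await client.GetAsync("/api/protected-resource");   // illustrative route
    Assert.Equal(HttpStatusCode.OK, response.StatusCode);
  }
}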

With this combination we are able to successfully mock Auth0 while retaining our ability to work with our database, test non-Auth0 functionality, and test authorization.