(Disclaimer: this article is also for OCaml developers, but I use F# because it is the language I am currently learning.)
Intro
This article is for F# developers who want to learn something new, something I haven’t seen used in F# projects. It is what made me move from Go to F#.
I swear that 20 years ago I promised myself that I would never touch .NET and here I am, promoting F# as a Golang developer. How did that even happen?
Well, three months ago, I researched the Odin programming language as an alternative to Go. Unfortunately, Odin didn’t fit the bill because it was missing backend libraries, but meanwhile I realized something that would shake my existence as a Golang developer.
Existential crisis
For those who may not know, Golang developers live around microservices, DevOps, and the cloud. Now ask yourself: how do you think a Golang developer would feel if you told them that it is better, faster, safer, and cheaper to create a monolithic server on a dedicated machine without containers? And it is not only Golang developers who would be shaken, because most developers have moved to microservices by now.
Companies have moved to microservices because they believe that is the best way to handle software development. But microservices are a software architecture choice, not a performance optimization. If you don’t believe me, look at where companies deploy their software. Many of them deploy the microservices in containers on the same machine and use the “localhost” network to communicate between them. Wouldn’t it be more efficient if they could replace the REST-API or gRPC with Go’s channels?
The question is: why did companies move from monoliths to microservices? What are they trying to avoid? I have two theories on that.
- My first theory is that they can’t properly parse the errors in the huge stack traces of monolithic applications. By moving to microservices, they reduced the stack trace to three or fewer functions, so they can handle the error parsing with services like ElasticSearch.
- My second theory is that companies can’t keep old employees around to explain the architecture of large monolithic applications to newer employees. For that reason, they split the software into smaller microservices, believing that these will be easier to explain.
But did microservices solve these problems? No, I don’t think so. They just added one more layer of complexity to the mix, called DevOps and SRE. However, both problems can be addressed with what I call Typed Stack Traces, or TST for short.
TST
Typed Stack Traces are like a train of Union types that ends in an Enum. Each function has its own Union type as its error type, and that type contains the error types of the functions it calls.
For example, imagine we have four functions called F4(), F3(), F2() and F1(). If F4() calls F3(), F3() calls F2() and so on, then an error raised from F1() would have the error type “F4Error.F3Error.F2Error.F1Error.SomethingWrong”. The error is SomethingWrong, but its stack trace is “F4Error.F3Error.F2Error.F1Error.SomethingWrong”.
It may not carry the filename or the line of code like an ordinary stack trace, but that is useless information anyway.
For TST to work, you need Tagged Unions (Discriminated Unions) and Enums. Odin, F#, OCaml and Zig have them, and so do other languages with sum types such as Rust, Haskell and Swift; in languages without them, TST is hard to utilize (prove me wrong if you can). If you try to replicate this functionality with “string” errors, you will fail miserably.
Even though only Odin uses TST (accidentally) in its core library, I believe that F# and OCaml are a better fit for the task. The reason is that in F# you can combine all types in Union types, and the pattern matching is better.
Here is an example of TST in F#
type F1Error =
    | NoError
    | SomethingWrong

type F2Error =
    | NoError
    | AnotherError
    | F1Error of F1Error

type F3Error =
    | NoError
    | F2Error of F2Error

type F4Error =
    | NoError
    | F3Error of F3Error

let f1 (): F1Error =
    F1Error.SomethingWrong

let f2 (): F2Error =
    F2Error.F1Error(f1())

let f3 (): F3Error =
    F3Error.F2Error(f2())

let f4 (): F4Error =
    F4Error.F3Error(f3())

// val tstErr: F4Error = F3Error (F2Error (F1Error SomethingWrong))
let tstErr = f4()
printfn $"%A{tstErr}"
// prints > F3Error (F2Error (F1Error SomethingWrong))
Why you need TST
You need TST to architect your software. Think of it like UML, but as types in code. If you are an F# developer, then you already understand the importance of modeling the software in code, and TST is the way to model in code how functions call each other. For example, by looking at F4Error.F3Error.F2Error.F1Error.SomethingWrong, you know that F4() calls F3() and so on.
By modeling the software in code, you protect the architecture from undesired changes by junior developers who haven’t understood the product yet. Furthermore, it can separate the architects from the engineers: the architect works on the types, and the engineers then implement the functions based on those types.
Moreover, TST solves the problem of parsing errors, because pattern matching and switch statements are more reliable than parsing stack traces from ElasticSearch. For example, an application raises an exception that a connection is closed, but you can’t tell from the code which connection it was or how it got closed. The usual workaround is to create a CRON service that checks the state of the connections and wakes up an administrator if something is wrong. With TST, you don’t need to read the logs for errors or set up a CRON service. The architect has to predict all the possible issues through trees of TST and decide how to handle each of them.
TST will also make you write better error messages for the user, rather than forwarding cryptic error messages from the system. Those cryptic error messages cost companies millions, because they have to set up support services to answer calls.
For example, if a connection is lost, then with TST you will get an error like OpenAccount.CheckBalance.ConnectionLost. Because the types of that error tell us where it came from, we can use pattern matching to send the user a message that says “We apologize for the inconvenience, but you can’t check the balance currently, please retry in 10 minutes”. At the same time, we can send a different message to the administrator saying “Check why the server lost connection with Accounts DB”.
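Here is a minimal sketch of that idea. The types OpenAccountError and CheckBalanceError, the AccountNotFound case and the two helper functions are hypothetical names that I made up for illustration, following the OpenAccount.CheckBalance.ConnectionLost example above.

```fsharp
// Hypothetical error types for the OpenAccount -> CheckBalance call chain.
type CheckBalanceError =
    | ConnectionLost
    | AccountNotFound // assumed extra case, only to show a second branch

type OpenAccountError =
    | NoError
    | CheckBalanceError of CheckBalanceError

// Message for the end user: friendly, without system details.
let userMessage (err: OpenAccountError) : string =
    match err with
    | NoError -> "Your account is open."
    | CheckBalanceError ConnectionLost ->
        "We apologize for the inconvenience, but you can't check the balance currently, please retry in 10 minutes"
    | CheckBalanceError AccountNotFound ->
        "We could not find your account."

// Message for the administrator: None when nothing needs attention.
let adminMessage (err: OpenAccountError) : string option =
    match err with
    | CheckBalanceError ConnectionLost ->
        Some "Check why the server lost connection with Accounts DB"
    | NoError
    | CheckBalanceError AccountNotFound -> None

let err = OpenAccountError.CheckBalanceError ConnectionLost
printfn "%s" (userMessage err)
printfn "%A" (adminMessage err)
```

Both functions match on the same typed error, so the user and the administrator each get a message that fits them without anyone parsing a log line.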
Additionally, by asserting TST in unit tests, you test the main logic of the application even if this logic sits behind private methods that no unit test can reach.
Finally, this gives you meaningful code coverage, because each unit test covers a branch of the TST tree until all of them are covered.
How to architect your software using TST
As an F# developer, you already know that DDD exists to architect the domain (like the types of variables and state) and TDD exists to architect the logic of the application, but neither of them helps you architect the function calls. So I created a new way of thinking that I call Can’t Driven Development, or CDD for short, to help you architect the function calls.
Maybe you are starting to think that I am trolling you, but I want you to look at it philosophically. What defines a function is not the types of its parameters or the type of its result, but the constraints it puts on the parameters. For example, if I asked you which message system does not allow more than 160 characters per message, the only answer you would give me is SMS. This and other examples like it show that constraints define functionality.
If a function does one thing, then it does not return any error, but if it has constraints on the input that the type system can’t handle, then it will return errors. By defining the errors that each function may return, we can architect the most important parts of the software. For example, your code may have 1000 functions, but the essential ones are only those that return errors. The functions that return errors are the most important because they are the ones that interact with the user.
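To make the SMS example concrete, here is a small sketch. The names SendSmsError, maxSmsLength and sendSms are my own assumptions for illustration. The point is only that the 160-character constraint, which a plain string type can’t express, shows up as a case in the function’s error type.

```fsharp
// Hypothetical example: the constraints on the input become error cases.
type SendSmsError =
    | NoError
    | EmptyMessage
    | MessageTooLong

// The limit from the SMS example in the text.
let maxSmsLength = 160

let sendSms (text: string) : SendSmsError =
    if text.Length = 0 then EmptyMessage
    elif text.Length > maxSmsLength then MessageTooLong
    else NoError // a real implementation would hand the text to a gateway here

printfn "%A" (sendSms "Hello")
printfn "%A" (sendSms (String.replicate 161 "a"))
```

Reading only the SendSmsError type already tells you what the function can’t do, before you look at its body.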
CDD
To do CDD, you need to define the domain first and then create “Can’t” statements for that domain, following these steps:
- Decide whether a function does exactly one thing. If it doesn’t, then it contains “Can’t” statements.
- Functions that return negative results are “Can’t” statements. Negative results are the ones that contain “Not” in their explanation.
- If an action uses resources, then you need to create “Can’t” statements for these resources to handle their predefined limitations.
After defining the “Can’t” statements, we move on to the next steps and write them down as errors in the error type of the function.
Here is the list of rules that you have to follow to architect your function calls and start developing:
- To find which functions we have to create, first transform the “can’t” statements into errors and split them into functions based on sets of problems in the same domain. To achieve that, refactor the error types in iterations until the mental model is complete.
  - First, take all the “can’t” statements and make them errors in an Enum for the main function.
  - In the second iteration, split the errors into their own Enums based on a common subject, and transform the main Enum into a Union type that contains all the newly created Enums.
  - Repeat the above rules for all the newly created Enum types until all “can’t” statements are in their own set of domains.
- Define the order of the errors in the types based on the order of the calls in the functions.
- Create the functions for the error types.
  - Make sure they take all of their inputs from the parameters and don’t use any global variable.
  - You can create functions that don’t return any error type to reduce the code length of your main functions, but you can’t add new errors.
- Lock the errors with switch statements in tests, to make sure no one has added new errors, because new errors change the architecture.
- Start developing your software with TDD by asserting the TST trees first.
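The rule about locking the errors can be sketched like this. It is one possible approach, not the only one. It reuses the F1Error type from the earlier example, and the names describe, caseNames and locked are hypothetical. The match has no wildcard, so adding a case makes the compiler report an incomplete match (warning FS0025, which a project can choose to treat as an error), and the reflection check compares the case names at test time.

```fsharp
open Microsoft.FSharp.Reflection

type F1Error =
    | NoError
    | SomethingWrong

// Exhaustive match without a wildcard: a new case in F1Error makes the
// compiler flag this function as an incomplete match.
let describe (err: F1Error) : string =
    match err with
    | NoError -> "NoError"
    | SomethingWrong -> "SomethingWrong"

// Runtime lock: the case names must equal the list the architect approved.
let caseNames =
    FSharpType.GetUnionCases(typeof<F1Error>)
    |> Array.map (fun c -> c.Name)
    |> Array.toList

let locked = (caseNames = [ "NoError"; "SomethingWrong" ])
printfn "%b" locked
```

In a real test project, the last comparison would be an assertion, so the build fails as soon as someone adds or renames an error.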
Example in F#
To understand CDD in practice, I need to show you the thinking process through an exercise. For this exercise, let’s assume you signed up for an interview and the company sent you an exercise to finish in the next 8 hours.
Interview exercise
The company wants a REST-API method to greet new friends, but it also wants to block annoying people.
The HTTP method is GET /hello/{name} and it will have three types of responses:
- If the person calls the REST-API method for the first time, then the REST-API will respond “Hello, nice to meet you {name}”
- If the person calls the REST-API method for the second time, but after 60 seconds, then the method will respond “Hello, my friend”
- If the person calls the REST-API method repeatedly without waiting 60 seconds, then the method will respond with “Get away from me”
To finish the project, we need to follow a specific order of methodologies. First we need to architect the state with DDD, then architect the function calls with CDD and at the end architect the logic with TDD.
Architect Domain with DDD
In my previous blog post, I bashed DDD. However, after watching “Domain Driven Design with the F# type System - Scott Wlaschin”, I was intrigued. Let’s use it for our project.
From the exercise, I understand that we need to create a function for responding to other people, remembering their names and categorizing them as friends or not based on the last time we saw them.
Let’s create the type of the person
open System

type Person = {
    IsFriend: bool
    LastSeen: DateTime
}
Now we need to store this person with the rest of the people. Let’s create the type in which we will store them.
open System.Collections.Generic

type Storage() =
    // In-memory store of people, keyed by name.
    let people = Dictionary<string, Person>()
For each person who says “Hello” to us, we need to respond based on the last time we saw that person. The exercise says 60 seconds, but I want it configurable for faster tests.
type Configurations = {
    FriendUntilRepeatInSeconds: int
}
The exercise asks for an action that responds to each person who says “Hello” to us. Let’s create the function, which returns a string together with its error type, so we can start architecting the function calls.
type HelloFromError =
    | NoError

let HelloFrom (name: string) : string * HelloFromError =
    // Placeholder: empty message for now, the logic comes later.
    "", HelloFromError.NoError
Thanks to DDD we established the domain, but this is how far DDD can get us. From now on, we need to use CDD methodology to architect the function calls.
Architect Functions with CDD
Let’s use each rule from CDD methodology, one by one, for this exercise.
- The exercise requires the software to return three types of responses: two are errors (the person is unknown or a spammer) and one is positive (the person is a friend).
- We will use RAM to save the list of people, which means that we need to configure its limitations from the start.
So here is the list of “can’t” statements:
- If the service can’t find the name in the list of people, then it returns an error that the name is unknown and adds the name to the list of people as a friend.
- The service can’t consider anyone a friend who calls the server again in less than a minute.
- The service can’t accept new names when it reaches a limit.
- The service can’t accept names over a specific length, as they may consume resources that the hardware doesn’t have.
First iteration
Define all the “Can’t” statements as cases of an Enum in the error type of our main function “HelloFrom”.
type HelloFromError =
    | NoError
    | NameTooLong
    | Spammer
    | Unknown
    | NotEnoughSpace
Second iteration
Split the errors into functions based on the domain you think they should be.
type VerifyNameError =
    | NoError
    | NameTooLong

type AuthenticateNameError =
    | NoError
    | Spammer
    | Unknown

type AddNameToFriendsError =
    | NoError
    | NotEnoughSpace

type HelloFromError =
    | NoError
    | VerifyNameError of VerifyNameError
    | AuthenticateNameError of AuthenticateNameError
    | AddNameToFriendsError of AddNameToFriendsError
Now we have the architecture of the function calls. Furthermore, you can see that we put the cases in the order in which the functions will be called. For example, we put AuthenticateNameError before AddNameToFriendsError because we know that we need to authenticate the person before we add them to the list of people.
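Each error type implies one function with that error as its return type. Here is a minimal sketch of just the VerifyNameError branch, to show how a sub-function’s error gets wrapped into the main function’s error. The names verifyName, helloFrom and maxNameLength are assumptions, and so is the limit of 20 characters, which the exercise does not specify.

```fsharp
type VerifyNameError =
    | NoError
    | NameTooLong

type HelloFromError =
    | NoError
    | VerifyNameError of VerifyNameError

// Hypothetical limit; the exercise does not define one.
let maxNameLength = 20

// One function per error type: it returns only its own errors.
let verifyName (name: string) : VerifyNameError =
    if name.Length > maxNameLength then NameTooLong
    else VerifyNameError.NoError

// The main function wraps the sub-function's error in its own Union case.
let helloFrom (name: string) : string * HelloFromError =
    match verifyName name with
    | NameTooLong ->
        "Get away from me", HelloFromError.VerifyNameError NameTooLong
    | VerifyNameError.NoError ->
        $"Hello, nice to meet you {name}", HelloFromError.NoError

printfn "%A" (helloFrom "mano")
```

The other branches (AuthenticateNameError and AddNameToFriendsError) would follow the same pattern, each with its own function.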
Architect Logic with TDD
Now we are going to architect the logic of our application with TDD, which means that we will create the tests before implementing the functions.
These tests will be meaningful because they test the TST of the “HelloFrom” function.
[<Fact>] // assuming xUnit, which provides Assert.Equal and Assert.True
let ``test name too long`` () =
    let (msg, err) = HelloFrom("manoaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaas")
    Assert.Equal("Get away from me", msg)
    // Start as false, so the test fails unless the expected TST branch is reached.
    let mutable passedNameTooLong = false
    match err with
    | VerifyNameError vferr ->
        match vferr with
        | NameTooLong -> passedNameTooLong <- true
        | _ -> ()
    | _ -> ()
    Assert.True(passedNameTooLong)
Sample of implementation
Thanks to DDD, CDD and TDD, I did a lot of refactoring on the types. For example, I put HelloFrom in a module and I made the other functions private. I even added an extra error (kind of useless) for concurrency failures.
Here is the end result
member x.HelloFrom (name: string) : string * HelloFromError =
    let vername = x.verifyName name
    match vername with
    | NameTooLong ->
        "Get away from me", VerifyNameError(vername)
    | VerifyNameError.NoError ->
        let anerr = x.authenticateName name
        match anerr with
        | Spammer ->
            let uftserr = x.updateFriendToSpammer name
            if uftserr.IsNameDoesntExist then
                "Someone touched my memory", UpdateFriendToSpammerError(uftserr)
            else
                "Get away from me", AuthenticateNameError(anerr)
        | Unknown ->
            let antferr = x.addNameToFriends(name)
            if antferr.IsNotEnoughSpace then
                "Not enough space in brain", AddNameToFriendsError(antferr)
            else
                $"Hello, nice to meet you {name}", AuthenticateNameError(anerr)
        | AuthenticateNameError.NoError ->
            store.UpdatePerson (name, {IsFriend = true; LastSeen = DateTime.Now}) |> ignore
            $"Hello, my friend", HelloFromError.NoError
For the rest of the code you can visit https://github.com/rm4n0s/hello_from_fsharp, but I advise you not to expect much, as I started learning F# two weeks ago.
Conclusion
As you can see, thanks to TST, we now have a new way of architecting, developing and maintaining software.
However, I wonder why this hasn’t been discovered earlier. Why did we have to wait 30 years? Was it discovered but dismissed as stupid? I don’t believe that it is stupid. It is the reason that made me move from Go to F# (and OCaml).
I believe that this will change how we develop, maintain and run software. It will let us go back to developing monolithic applications with waterfall project management and resurrect the software architect’s job, especially now that a single server can have hundreds of cores and hundreds of GB of RAM. It will replace DevOps and SRE with software engineers and system administrators.