
The untouched goldmine of F#

F#’s secret superpower, which no one has discovered for 30 years and which will change the language’s popularity in enterprise software

January 12, 2025 · 17 min read

(Disclaimer: this article is also for OCaml and Rust developers, but I use F# because it is the language that I am currently learning.)

Intro

This article is for F# developers who want to learn something new, something that I haven’t seen used in F# projects. It is what made me move from Go to F#.

I swear that 20 years ago I promised myself that I would never touch .NET and here I am, promoting F# as a Golang developer. How did that even happen?

Well, three months ago, I researched the Odin programming language as an alternative to Go. Unfortunately, Odin didn’t fit the bill because it was missing backend libraries, but meanwhile I realized something that would shake my existence as a Golang developer.

Existential crisis

For those who may not know, Golang developers live around microservices, DevOps, and cloud. Now ask yourself: how do you think a Golang developer would feel if you told them that it is better, faster, safer, and cheaper to create a monolithic server on a dedicated machine without containers? But it is not only Golang developers who would be shaken, because almost all developers have now moved to microservices.

Companies have moved to microservices because they believe that it is the best way to handle software development. But microservices are a software architecture choice, not a performance optimization. If you don’t believe me, then look at where companies deploy software. Many of them deploy the microservices in containers on the same machine and use the “localhost” network to communicate between them. Wouldn’t it be more efficient if they could replace the REST-API or gRPC with Go’s channels?

The question is: why did companies move from monolithic applications to microservices? What are they trying to avoid? I have two theories on that.

My first theory is that they can’t properly parse the errors in the huge stack traces of monolithic applications. By moving to microservices, they reduced the stack trace to three or fewer functions, so that they can handle the error parsing with services like ElasticSearch.
My second theory is that companies can’t keep old employees around to explain the architecture of large monolithic applications to newer employees. For that reason, they split the software into smaller microservices, believing that these will be easier to explain.

But did microservices solve these problems? No, I don’t think so. They just added one more layer of complexity to the mix, called DevOps and SRE. However, these two problems can be addressed with what I call Typed Stack Traces, or TST for short.

TST

Typed Stack Traces are like a train of Union types that ends in an Enum. This means that each function has its own Union type as its error type, and that type contains the error types of the functions it calls.

For example, imagine we have four functions called F4(), F3(), F2() and F1(). If F4() calls F3(), F3() calls F2() and so on, then an error raised from F1() would have the type “F4Error.F3Error.F2Error.F1Error.SomethingWrong”. The error is SomethingWrong, and everything before it is its stack trace.

It may not include the filename or the line of code like ordinary stack traces, but that information is useless anyway.

For TST to work, you need Tagged Unions (also called Discriminated Unions) and Enums. Currently, only Odin, F#, OCaml, Rust and Zig have them, which means that you can’t utilize TST in other programming languages (prove me wrong if you can). If you try to replicate this functionality with “string” errors, you will fail miserably.

Even though only Odin uses TST (accidentally) in its core library, I believe that F#, OCaml and Rust are a better fit for the task. The reason is that in F# you can combine any types in Union types, and the pattern matching is better.

Here is an example of TST in F#:

type F1Error =
    | NoError
    | SomethingWrong


type F2Error = 
    | NoError
    | AnotherError
    | F1Error of F1Error


type F3Error =
    | NoError
    | F2Error of F2Error


type F4Error =
    | NoError
    | F3Error of F3Error


let f1 (): F1Error =
    F1Error.SomethingWrong

let f2 (): F2Error =
    F2Error.F1Error(f1())

let f3 (): F3Error =
    F3Error.F2Error(f2())

let f4 (): F4Error =
    F4Error.F3Error(f3())

// val tstErr: F4Error = F3Error (F2Error (F1Error SomethingWrong))
let tstErr = f4()

printfn $"%A{tstErr}"
// prints > F3Error (F2Error (F1Error SomethingWrong))
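The dotted form “F4Error.F3Error.F2Error.F1Error.SomethingWrong” can be produced from the nested value with one small function per level. The following is only a sketch: the names f1Path to f4Path are hypothetical helpers, and the NoError cases are left out to keep it short.

```fsharp
type F1Error =
    | SomethingWrong

type F2Error =
    | AnotherError
    | F1Error of F1Error

type F3Error =
    | F2Error of F2Error

type F4Error =
    | F3Error of F3Error

// Hypothetical helpers, one per level of the tree.
// Every match is exhaustive, so a new case in any type triggers a compiler warning here.
let f1Path (err: F1Error) =
    match err with
    | SomethingWrong -> "F1Error.SomethingWrong"

let f2Path (err: F2Error) =
    match err with
    | AnotherError -> "F2Error.AnotherError"
    | F1Error inner -> "F2Error." + f1Path inner

let f3Path (err: F3Error) =
    match err with
    | F2Error inner -> "F3Error." + f2Path inner

let f4Path (err: F4Error) =
    match err with
    | F3Error inner -> "F4Error." + f3Path inner

let path = f4Path (F3Error (F2Error (F1Error SomethingWrong)))
printfn "%s" path
// prints > F4Error.F3Error.F2Error.F1Error.SomethingWrong
```

The string is only for display. The value itself stays typed, so the rest of the code can keep pattern matching on it.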

Why you need TST

You need TST to architect your software. Think of it like UML, but as types in code. If you are an F# developer, then you already understand the importance of modeling the software in code and TST is the way to model in code how functions call each other. For example, by looking at F4Error.F3Error.F2Error.F1Error.SomethingWrong, you know that F4() calls F3() and so on.

By modeling the software in code, you protect the architecture from undesired changes by junior developers who haven’t understood the product yet. Furthermore, it can separate the architects from the engineers: the architect works on the types, and the engineers just implement the functions based on those types.

Moreover, TST solves the problem of parsing errors, because pattern matching and switch statements are more reliable than parsing stack traces from ElasticSearch. For example, an application raises an exception that a connection is closed, but you can’t know from the code which connection it was or how it was closed. The only way to resolve this problem is to create a CRON service that checks the state of the connections and wakes up an administrator if something is wrong. With TST, you will never need to read the logs for errors or set up a CRON service. The architect has to predict all the possible issues through trees of TST and decide how to handle each of them.

TST will also make you write better error messages for the user, rather than forwarding cryptic error messages from the system. Those cryptic error messages cost companies millions, because they have to set up services to answer customers’ calls.

For example, if a connection is lost, then with TST you will have an error like “GetAccount.GetBalance.MySQLCall.ConnectionLost”. Because the types of that error tell us where it came from, we can use pattern matching to send one message specifically to the user and another one specifically to the administrators.

Here is an example of this scenario in F#:

type MySQLCallError =
    | ConnectionLost

type GetBalanceError = 
    | MySQLCallError of MySQLCallError

type GetStocksError = 
    | MySQLCallError of MySQLCallError

type GetAccountError =
    | GetBalanceError of GetBalanceError
    | GetStocksError of GetStocksError

// GetAccount, printErrMsgToUser, informAdministrators and showAccount
// are assumed to be defined elsewhere.
let viewAccountPage accountID = 
    let accountRes = GetAccount accountID 
    match accountRes with
    | Error accountErr -> 
        match accountErr with
        | GetBalanceError gberr ->
            match gberr with
            // Qualified, because GetStocksError declares a case with the same name
            // and the unqualified name would resolve to that later one.
            | GetBalanceError.MySQLCallError myerr ->
                match myerr with
                | ConnectionLost ->
                    printErrMsgToUser "We apologize for the inconvenience, but you can't check the account currently, please retry again in 10 minutes"
                    informAdministrators "The server lost connection with the DB for Accounts"
        // The GetStocksError branch is left out here, so the compiler
        // warns that this match is incomplete.
    | Ok account ->
        showAccount account

Additionally, by asserting TST in unit tests, you test the main logic of the application even when that logic sits behind private methods that no unit test can touch.

Finally, this way you can get meaningful code coverage from unit tests, where the tests together cover all the branches of the TST tree. You also lock the architecture, because the compiler will warn you about any change in the TST tree.
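Asserting a TST in a test can be as small as one equality check, because union types compare structurally. This is a sketch under assumptions: getAccount is a hypothetical stand-in that always fails on the balance lookup, and plain equality is used instead of a test framework.

```fsharp
type MySQLCallError =
    | ConnectionLost

type GetBalanceError =
    | MySQLCallError of MySQLCallError

type GetStocksError =
    | MySQLCallError of MySQLCallError

type GetAccountError =
    | GetBalanceError of GetBalanceError
    | GetStocksError of GetStocksError

// Hypothetical stand-in for the real call: it always fails on the balance lookup.
let getAccount (accountID: int) : Result<string, GetAccountError> =
    Error (GetBalanceError (GetBalanceError.MySQLCallError ConnectionLost))

// The whole tree is compared in one go, so the test pins down the path, not just the leaf.
let expected : Result<string, GetAccountError> =
    Error (GetBalanceError (GetBalanceError.MySQLCallError ConnectionLost))

let sameLeafOtherPath : Result<string, GetAccountError> =
    Error (GetStocksError (GetStocksError.MySQLCallError ConnectionLost))

let testPassed = (getAccount 42 = expected)
let pathMatters = (getAccount 42 <> sameLeafOtherPath)
```

Both values end in ConnectionLost, but they are different errors because they took different routes through the tree.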

How to architect your software using TST

As an F# developer, you already know that DDD exists to architect the Domain (like the types of variables and state) and that TDD exists to architect the logic of the application, but neither of these will help you architect the function calls. So I created a new way of thinking that I call Can’t Driven Development, or CDD for short, to help you architect the function calls.

Maybe you will start thinking that I am trolling you, but I want you to see it philosophically. What defines a function is not the types of its parameters or the type of its result, but the constraints it puts on the parameters. For example, if I asked you which messaging system does not allow more than 160 characters per message, the only answer you would give me is SMS. This and other examples like it show that constraints define functionality.

If a function does one thing, then it does not return any error, but if it has constraints on the input that the type system can’t handle, then it will return errors. By defining the errors that each function may return, we can architect the most important parts of the software. For example, you may have 1000 functions in your code, but the most essential ones are only those that return errors. The functions that return errors are the most important because they interact with the user.
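The SMS example can be written down as a constraint that the type system can’t express, so it shows up in the error type instead. This is only a sketch: sendSms and SendSmsError are hypothetical names, and nothing is actually sent.

```fsharp
type SendSmsError =
    | MessageTooLong

// Hypothetical function: the 160-character limit is the constraint that defines it.
// A plain `string` can't carry that limit, so the function reports it as an error.
let sendSms (message: string) : Result<string, SendSmsError> =
    if message.Length > 160 then Error MessageTooLong
    else Ok message

let shortResult = sendSms "Hello"
let longResult = sendSms (String.replicate 161 "a")
```

Here shortResult is Ok "Hello" and longResult is Error MessageTooLong. The error type is where the “can’t” of the function lives.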

CDD

To apply CDD, you need to define the domain first and then create “Can’t” statements for that domain by following these steps:

Decide whether a function does exactly one thing. If it doesn’t, then it contains “Can’t” statements.
Functions that return negative results are “Can’t” statements. Negative results are the ones that contain “Not” in their explanation.
If an action uses resources, then you need to create “Can’t” statements for these resources to handle their predefined limitations.

After defining the “Can’t” statements, we go to the next steps to write them down as errors in the error type of the function.

Here is the list of rules that you have to follow to architect your function calls and start developing:

To find out which functions we have to create, we first have to transform our “can’t” statements into errors and split them into functions based on sets of problems in the same domain. To achieve that, we need to refactor our error types in iterations, until the mental model is complete.
- First, take all the “can’t” statements and make them errors in an Enum for the main function.
- For the second iteration, split the errors into their own Enums based on a common subject, and transform the main Enum into a Union type that contains all the newly created Enums.
- Repeat the above rules for all the newly created Enum types until all “can’t” statements are in their own set of domains.
Define the order of the errors in the types based on the order of the calls in the functions.
Create the functions for the error types.
- Make sure they take all of their inputs from the parameters and don’t use any global variables.
- You can create functions that don’t return any error type to reduce the code length of your main functions, but you can’t add new errors.
Lock the errors with switch statements in tests, to make sure no one has added new errors, because new errors change the architecture.
Start developing your software with TDD by asserting the TST trees first.
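The “lock” rule above could look something like this. It is a sketch with hypothetical types (ReadFileError, LoadConfigError) and a hypothetical function name. The idea is a match without a wildcard case, so that adding an error makes the compiler report an incomplete pattern match (warning FS0025).

```fsharp
type ReadFileError =
    | FileNotFound
    | FileTooLarge

type LoadConfigError =
    | NoError
    | ReadFileError of ReadFileError

// Hypothetical lock: every branch of the tree is listed and there is no `_` case.
// If someone adds a new error, this match becomes incomplete and the compiler says so.
let lockLoadConfigError (err: LoadConfigError) : string =
    match err with
    | NoError -> "NoError"
    | ReadFileError FileNotFound -> "ReadFileError.FileNotFound"
    | ReadFileError FileTooLarge -> "ReadFileError.FileTooLarge"

// The test walks every known branch once.
let allKnownErrors =
    [ NoError
      ReadFileError FileNotFound
      ReadFileError FileTooLarge ]

let names = allKnownErrors |> List.map lockLoadConfigError
```

If the build treats that warning as an error, a new error case can’t slip in without someone touching the lock.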

Example in F#

To understand CDD in practice, I need to show you the thinking process through an exercise. For this exercise, let’s assume you signed up for an interview and the company sent you an exercise to finish in the next 8 hours.

Interview exercise

The company wants a REST-API method to greet new friends, but it also wants to block annoying people.

The HTTP method is GET /hello/{name} and it will have three types of responses:

If the person calls the REST-API method for the first time, then the REST-API will respond “Hello, nice to meet you {name}”
If the person calls the REST-API method for the second time, but after 60 seconds, then the method will respond “Hello, my friend”
If the person calls the REST-API method repeatedly without waiting 60 seconds, then the method will respond with “Get away from me”

To finish the project, we need to follow a specific order of methodologies. First we need to architect the state with DDD, then architect the function calls with CDD and at the end architect the logic with TDD.

Architect Domain with DDD

In my previous blog post, I bashed DDD. However, after watching “Domain Driven Design with the F# type System - Scott Wlaschin”, I was intrigued. Let’s use it for our project.

From the exercise, I understand that we need to create a function for responding to other people, remembering their names and categorizing them as friends or not based on the last time we saw them.

Let’s create the type of the person:

type Person = { 
    IsFriend: bool
    LastSeen: DateTime 
}

Now we need to store this person with the rest of the people. Let’s create the type in which we will store the people.

type Storage() =
    // Requires `open System.Collections.Generic`
    let people = Dictionary<string, Person>()
    member _.People = people

For each person who says “Hello” to us, we need to respond based on the last time we saw that person. The exercise says 60 seconds, but I want it configurable for faster tests.

type Configurations = { 
    FriendUntilRepeatInSeconds: int 
}

The exercise requests an action to respond to each person who says “Hello” to us. Let’s create the function, that returns a string but also its error type, to start architecting the function calls.

type HelloFromError =
    | NoError

let HelloFrom (name: string) : string * HelloFromError =
    // Placeholder body: an empty message and no error
    "", HelloFromError.NoError

Thanks to DDD we established the domain, but this is as far as DDD can get us. From now on, we need to use the CDD methodology to architect the function calls.

Architect Functions with CDD

Let’s use each rule from CDD methodology, one by one, for this exercise.

The exercise requires the software to return three types of responses, where two are errors (when the person is unknown or a spammer) and one is positive (when the person is a friend).
We will use RAM to save the list of people, which means that we need to configure its limitations from the start.

So here is the list of “can’t” statements:

If the service can’t find the name in the list of people, then it returns an error that the name is unknown and adds the name to the list of people as a friend.
The service can’t consider anyone a friend who bothers the server again in less than a minute.
The service can’t accept new names when it reaches a limit.
The service can’t accept names over a specific length, as they may consume resources that the hardware doesn’t have.

First iteration

Define all the “Can’t” statements as Enums in the error type of our main function, “HelloFrom”:

type HelloFromError =
    | NoError
    | NameTooLong
    | Spammer
    | Unknown
    | NotEnoughSpace

Second iteration

Split the errors into functions based on the domain you think they should be.

type VerifyNameError =
    | NoError
    | NameTooLong

type AuthenticateNameError =
    | NoError
    | Spammer
    | Unknown

type AddNameToFriendsError =
    | NoError
    | NotEnoughSpace

type HelloFromError =
    | NoError
    | VerifyNameError of VerifyNameError
    | AuthenticateNameError of AuthenticateNameError
    | AddNameToFriendsError of AddNameToFriendsError

Now we have the architecture of the function calls. Furthermore, you can see that we put the errors in order based on the way the functions will be called. For example, we put AuthenticateNameError before AddNameToFriendsError because we know that we need to authenticate the person before we put them in the list of people.

Architect Logic with TDD

Now we are going to architect the logic of our application with TDD, which means that before implementing the functions, we will create the tests.

These tests will be meaningful because they will test the TST of the “HelloFrom” function.


let ``test name too long`` () =
    let (msg, err) = HelloFrom("manoaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaas")
    Assert.Equal("Get away from me", msg)

    // Start as false, so the test fails unless the NameTooLong branch is reached
    let mutable passedNameTooLong = false

    match err with
    | VerifyNameError vferr ->
        match vferr with
        | NameTooLong -> passedNameTooLong <- true
        | _ -> ()
    | _ -> ()

    Assert.True(passedNameTooLong)

Sample of implementation

Thanks to DDD, CDD and TDD, I did a lot of refactoring on the types. For example, I put HelloFrom in a module and I made the other functions private. I even added an extra error (kind of useless) for concurrency failures.

Here is the end result

    member x.HelloFrom (name: string) : string * HelloFromError =
        let vername = x.verifyName name 
        match vername with
        | NameTooLong ->
            "Get away from me",VerifyNameError(vername)
        | VerifyNameError.NoError -> 
            let anerr =   x.authenticateName name
            match anerr with 
            | Spammer -> 
                let uftserr = x.updateFriendToSpammer name 
                if uftserr.IsNameDoesntExist then 
                    "Someone touched my memory", UpdateFriendToSpammerError(uftserr)
                else
                    "Get away from me", AuthenticateNameError(anerr)
            | Unknown -> 
                let antferr = x.addNameToFriends(name)
                if antferr.IsNotEnoughSpace then 
                    "Not enough space in brain", AddNameToFriendsError(antferr)
                else 
                    $"Hello, nice to meet you {name}" , AuthenticateNameError(anerr)
                
            | AuthenticateNameError.NoError -> 
                store.UpdatePerson (name, {IsFriend = true ; LastSeen = DateTime.Now}) |> ignore
                $"Hello, my friend", HelloFromError.NoError

For the rest of the code you can visit https://github.com/rm4n0s/hello_from_fsharp, but I advise you to not expect much as I started learning F# two weeks ago.
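For readers who want something runnable without cloning the repository, here is a rough, self-contained sketch of the same flow. It is not the code from the repository. The Greeter class, its constructor parameters and the limits passed to it are all assumptions, the child functions return Result instead of their own NoError case, the extra concurrency error is left out, and the current time is passed in as a parameter so that the behaviour is repeatable.

```fsharp
open System
open System.Collections.Generic

type Person = { IsFriend: bool; LastSeen: DateTime }

type VerifyNameError =
    | NameTooLong

type AuthenticateNameError =
    | Spammer
    | Unknown

type AddNameToFriendsError =
    | NotEnoughSpace

type HelloFromError =
    | NoError
    | VerifyNameError of VerifyNameError
    | AuthenticateNameError of AuthenticateNameError
    | AddNameToFriendsError of AddNameToFriendsError

// Hypothetical class: all limits come in through the constructor, no globals.
type Greeter(maxNameLength: int, maxPeople: int, repeatSeconds: int) =
    let people = Dictionary<string, Person>()

    let verifyName (name: string) =
        if name.Length > maxNameLength then Error NameTooLong else Ok ()

    // Assumption: a known person is a spammer if they come back before the window has passed.
    let authenticateName (name: string) (now: DateTime) =
        match people.TryGetValue(name) with
        | false, _ -> Error Unknown
        | true, person ->
            if (now - person.LastSeen).TotalSeconds < float repeatSeconds then Error Spammer
            else Ok ()

    let addNameToFriends (name: string) (now: DateTime) =
        if people.Count >= maxPeople then
            Error NotEnoughSpace
        else
            people.[name] <- { IsFriend = true; LastSeen = now }
            Ok ()

    member _.HelloFrom (name: string, now: DateTime) : string * HelloFromError =
        match verifyName name with
        | Error err -> "Get away from me", VerifyNameError err
        | Ok () ->
            match authenticateName name now with
            | Error Spammer ->
                // Assumption: a spam call also resets the waiting window.
                people.[name] <- { IsFriend = false; LastSeen = now }
                "Get away from me", AuthenticateNameError Spammer
            | Error Unknown ->
                match addNameToFriends name now with
                | Error err -> "Not enough space in brain", AddNameToFriendsError err
                | Ok () -> $"Hello, nice to meet you {name}", AuthenticateNameError Unknown
            | Ok () ->
                people.[name] <- { IsFriend = true; LastSeen = now }
                "Hello, my friend", NoError

let greeter = Greeter(20, 2, 60)
let t0 = DateTime(2025, 1, 12, 12, 0, 0)

let first = greeter.HelloFrom("manos", t0)
let tooSoon = greeter.HelloFrom("manos", t0.AddSeconds(10.0))
let later = greeter.HelloFrom("manos", t0.AddSeconds(100.0))
let tooLong = greeter.HelloFrom(String.replicate 50 "a", t0)
```

With these assumed limits, the first call greets a new friend, the call 10 seconds later is treated as spam, and the call 90 seconds after that is a friend again.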

Conclusion

As you can see, thanks to TST, we have now discovered a new way of architecting, developing and maintaining software.

However, I wonder why this hasn’t been discovered earlier. Why did we have to wait 30 years? Was it discovered but dismissed as stupid? I don’t believe that it is stupid. It is the reason that made me move from Go to F#.

I believe that this will change how we develop, maintain and run software. It will let us go back to developing monolithic applications with waterfall project management and resurrect the job of the software architect, especially now that just one server can have hundreds of cores and hundreds of GB of RAM.

I believe in a world where software architects exist. The architects will create the types and engineers will implement the functions. The engineers will be able to refactor the functions as much as they want, but they will never be able to refactor types without the approval of the architects.

A world where DevOps and SRE are replaced by system administrators.

(UPDATE) FAQ

Q: “Ah! you just discovered ADT, that is all!”

No, I discovered a better use of ADT for errors and a way to mitigate any possible failure from the start, while you concatenate your errors so you can have pseudo stack traces in your logs.

We are not the same.

Q: “Typescript, Kotlin, Scala, Go etc can do the same thing!”

No, they don’t have TST. But if you believe that your favorite programming language has something similar to TST, then solve the “Error Handling Challenge” and send me the solution on Bluesky, so I can put you in the list of challengers who solved the problem.

Q: “Why are stack traces useless?”

They are useless because you don’t know when they will happen, and when they happen it is too late!

I don’t care if the stack traces are from exceptions or “error wrapping” (concatenated string of errors to create a pseudo-stack trace), they are still useless.

The point of TST is not to read them somewhere in the logs when your system crashes or when you receive a complaint from a customer to fix a bug.

The point of TST is to replace logs with predicted error handling.

No more screams of “The software crashed! Give me an hour to figure out why in the logs”

The yelling with TST will be like “The software notified me by SMS that the Account’s DB is down, why is the DB down?”

Q: “Why did you use ‘string * HelloFromError’ and not Result<string,HelloFromError>?”

Because I came from Odin and this is how they handle errors. It will take some time until I learn the F# style.

But in my defense, I used the Result style in the “viewAccountPage” function in the “Why you need TST” section.
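For comparison, the same kind of tree can be carried in a Result. This is a small sketch, not the code from the repository: the limit of 20 characters is an assumption and only one branch of the tree is shown. Result.mapError from the standard library lifts the child error into the parent type.

```fsharp
type VerifyNameError =
    | NameTooLong

type HelloFromError =
    | VerifyNameError of VerifyNameError

// Hypothetical limit, only for this sketch
let maxNameLength = 20

let verifyName (name: string) : Result<string, VerifyNameError> =
    if name.Length > maxNameLength then Error NameTooLong else Ok name

// The child error is wrapped into the parent union, which keeps the TST shape.
// The NoError case is no longer needed because Ok already means "no error".
let helloFrom (name: string) : Result<string, HelloFromError> =
    verifyName name
    |> Result.mapError VerifyNameError
    |> Result.map (fun validName -> $"Hello, nice to meet you {validName}")

let ok = helloFrom "manos"
let failed = helloFrom (String.replicate 30 "a")
```

Here ok is Ok "Hello, nice to meet you manos" and failed is Error (VerifyNameError NameTooLong), so the typed trace survives the change of style.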

Q: “Isn’t it excessive to pattern match all the TST tree?”

Yes, it is excessive.

You can pattern match half the tree and return a general error.

You can pattern match only a leaf of the tree.

Furthermore, you can even if-cond a specific error enum from the TST tree.

But by pattern matching the whole tree, you predict every failure and you handle it responsibly, so nothing will ever surprise you.

Q: “What is the difference between GetAccount.GetBalance.MySQLCall.ConnectionLost and GetAccount.GetStocks.MySQLCall.ConnectionLost, if all that matters is MySQLCall.ConnectionLost?”
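The three cheaper options could look like this on the F1 to F4 types from the first example. It is a sketch: the function names and the returned messages are made up for illustration.

```fsharp
type F1Error =
    | SomethingWrong

type F2Error =
    | AnotherError
    | F1Error of F1Error

type F3Error =
    | F2Error of F2Error

type F4Error =
    | NoError
    | F3Error of F3Error

// Half the tree: stop at F3Error and return a general error.
let general (err: F4Error) =
    match err with
    | NoError -> "ok"
    | F3Error _ -> "something failed below f3"

// Only one leaf: a single nested pattern, everything else falls through.
let leafOnly (err: F4Error) =
    match err with
    | F3Error (F2Error (F1Error SomethingWrong)) -> "f1 reported SomethingWrong"
    | _ -> "not the leaf we care about"

// if-cond: union types compare structurally, so `=` is enough.
let isSomethingWrong (err: F4Error) =
    err = F3Error (F2Error (F1Error SomethingWrong))

let sample = F3Error (F2Error (F1Error SomethingWrong))
```

The wildcard versions are shorter, but they also switch off the compiler warning that tells you when the tree grows.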

If you use one DB for the software, then you can just call if-cond on MySQLCall.ConnectionLost if you like.

If you use three DBs, then you can still use if-cond on MySQLCall.ConnectionLost as a type that includes the database’s name as well.

The point of going through the whole pattern matching is to predict and handle every case.

It is also to give a more detailed explanation of the problem to the user who paid lots of cash, so they don’t have to call you back in the middle of the night and make you read the stack trace out loud from the logs.
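When several databases are involved, the leaf can carry the database’s name, as mentioned above. A sketch, assuming a string is enough to identify the database; the lostDatabase helper and the name “stocks-db” are hypothetical.

```fsharp
type MySQLCallError =
    | ConnectionLost of dbName: string

type GetBalanceError =
    | MySQLCallError of MySQLCallError

type GetStocksError =
    | MySQLCallError of MySQLCallError

type GetAccountError =
    | GetBalanceError of GetBalanceError
    | GetStocksError of GetStocksError

// Hypothetical helper: both paths are handled, and each one yields the database that went away.
let lostDatabase (err: GetAccountError) : string =
    match err with
    | GetBalanceError (GetBalanceError.MySQLCallError (ConnectionLost db)) -> db
    | GetStocksError (GetStocksError.MySQLCallError (ConnectionLost db)) -> db

let err = GetStocksError (GetStocksError.MySQLCallError (ConnectionLost "stocks-db"))
let db = lostDatabase err
```

The path still tells you which feature failed (stocks, not balance), and the payload tells you which database to look at.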
