La Technique

About the Leornian project

A few weeks ago, I completed the first version of Leornian and deployed it at leornian.info. Leornian is a simple web app that I mainly worked on as a learning exercise, but it’s also the implementation of an idea that I’ve been brewing for a while now.

The website’s About page provides the elevator pitch-style intro for the app; this blog post provides extensive background about the project: its motivations, conceptualization and design, and the things I’ve learned while building it.

Motivations

Although I’ve built, maintained, and supported many software applications in my career, I hadn’t yet had a personal, open-source passion project. All the applications I’d worked on were proprietary, so my public GitHub account was looking quite empty.

I decided to build Leornian not just to expand my public portfolio but also as a learning exercise. I wanted to build an application at my own pace, without the pressure of deadlines, giving me space to identify the gaps in my skills, to revisit topics I’d once had to learn in a rush, and to further master the fundamentals of web application engineering. I had time to check out many interesting topics: while building Leornian, I studied how MediaWiki optimizes a type of SQL query; investigated PostgreSQL’s implementation of the random() function; and had an exchange with a Django core developer regarding a proposed ListView API change.

From the beginning I knew I’d be using the Django framework for the project. The usual advice goes that the needs of the project should dictate the choice of technology, and therefore the choice of a framework should be taken later on, not at the outset. Not only is this advice often impractical—with an existing developer team, it can be more economical to leverage their existing skills and technology preferences—but my project is also a learning exercise for which learning a new framework is not an objective. With Django and Python, the tools I’m most fluent in, I could hit the ground running.

Conceptualization

I wanted the product to be of practical value, at least for myself; it could not be just a toy exercise. That way, I expected to learn more from the project.

The core idea of Leornian is that it’s a simple note-taking app with a function for resurfacing content. This is not a particularly novel concept, and indeed there are many similar apps already out there. But as with any software product, the differentiation comes from the particular details of implementation, and these specific design decisions flow from the particular use cases the developer has in mind.

In the case of Leornian, my original idea was for an app for saving and reviewing ‘notes to self’. There is no shortage of note-taking apps, but there aren’t as many that focus on the reviewing or content-resurfacing aspect. My name for this original concept was “Quoach”, from “coaching through quotes”.

‘NoteHub’

Soon after having that idea, I found an app that satisfied most of what I was looking for. I’ll be coy about its identity: let’s just call it NoteHub. It has the same concept of promoting habitual learning and self-improvement by building a library of snippets of text. The difference is that, rather than an offline app that you populate with your own notes, it is also a content platform with quasi-social features. I became an avid user, to the point that the developers once contacted me for a user research interview.

Inevitably though, I had some dissatisfactions with the app:

  1. I started worrying about my library of content being locked in to the platform. Of course the notes weren’t mine to begin with (I don’t hold the copyright), but I eventually amassed what I thought was a valuable collection of ideas, and there was nothing in NoteHub that promoted any kind of archiving or data portability.
  2. The developers kept making substantial changes to the interface, often to turn up the gamification, and it became somewhat frustrating to use.
  3. Although the app did provide a basic, random fetch feature, I found that I wanted a more sophisticated method of resurfacing content.

Serendipitous learning

So I continued thinking about my own app concept, and I blended in more related ideas about creativity and learning that I was also preoccupied with around the same time. Regarding methods of resurfacing content, I’d become a believer in the value of randomness. The idea has many preachers: Tiago Forte, of “Building a Second Brain” fame, is one of the most prominent, and he has sung many praises about the power of random resurfacing to spark serendipity and creativity. Most recently I’ve also seen the concept mentioned in author Robin Sloan’s newsletter.

My experience with NoteHub, however, showed me that pure, uniform randomness isn’t enough. Serendipity is nice, but I had also been learning about the learning process itself, and wanted to somehow incorporate the principles of deliberate & targeted practice, continuous & spaced training, and active reflection into the resurfacing function. These are lofty ideals, and a serious effort at forging these into an algorithm would be worthy of an academic dissertation; in my own, modest project, I was satisfied with playing around with code until I came up with the Leornian Drill, and in justifying some design decisions based on the aforementioned principles (for example, the Leornian Drill deliberately presents only one piece of content at a time instead of, say, an infinite scroll, hopefully to encourage slow but active reflection).

Explore and exploit

I had started to read a book about machine reinforcement learning (RL), and one of the key concepts in RL is the exploration/exploitation trade-off, which can also be applied to human learning: at any point in time we are either exploring our environment (our “action space”) in search of more effective knowledge and “policies” (programs of action), or we are exploiting our existing knowledge and policies to reap the expected rewards; but not both, at least not simultaneously.
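One common way RL texts make this trade-off concrete is the epsilon-greedy rule: with a small probability the agent explores a random action, otherwise it exploits the action it currently believes is best. The sketch below is only an illustration of that general rule, not anything taken from Leornian; the function name `epsilon_greedy` and the reward estimates are made up for the example.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Pick an action index: explore with probability epsilon, else exploit.

    `estimates` holds the current reward estimate for each action
    (hypothetical values, for illustration only).
    """
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: choose the action with the highest current estimate.
    return max(range(len(estimates)), key=estimates.__getitem__)


rng = random.Random(42)
estimates = [0.2, 0.8, 0.5]

# With epsilon=0 the rule never explores, so it always picks the best-known action.
print(epsilon_greedy(estimates, epsilon=0.0, rng=rng))  # 1

# With a small epsilon, most picks exploit and a few explore.
picks = [epsilon_greedy(estimates, epsilon=0.1, rng=rng) for _ in range(20)]
print(picks)
```

Each call does exactly one of the two things, exploring or exploiting, which is the "not both, at least not simultaneously" point in miniature.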

I thought about how this concept related to NoteHub, and I realized I used the app in two ways: I was either in exploration mode, reading new notes from the feed of other users’ content, or I was in exploitation mode (or, at least, practice mode), reviewing and revisiting my existing collection of saved content. I decided that the app I was going to build should also have this content platform aspect, so that users can have, within the app, an environment in the reinforcement learning sense, a source of novelty.

At this point, I deemed the app concept solid enough. It had shifted a little towards learning in the educational sense, but the original ideas of coaching and self-improvement, I realized, are really just forms of learning as well. Because of this, and also because “Quoach” sounds cheesy, I took the new name Leornian, the Old English word from which “learn” derives.

Building and learning

Many parts of Leornian, for example those that conform closely to the create-read-update-delete (CRUD) model, were a breeze to build, or at least involved straightforward dev work. Django’s “batteries included” feature set and the Python-Django package ecosystem were a godsend. I still took time to think carefully about aspects like proper MVC/MTV architecture implementation and Django app reusability; these were the considerations I often had to sacrifice in projects in the name of meeting deadlines.

I planned to use htmx, and still do; it has become something of a darling in the Django developer community as our savior from “JavaScript fatigue”. But I decided to defer its use until I’d completed a minimum-viable-product feature set for the project, and then I surprised myself by just how much functionality I could implement with good old HTML and only the slightest sprinkling of JavaScript.

Development started to get more challenging, unsurprisingly, when I started to work on the domain logic. Developing the content resurfacing function had me reaching for Jupyter Notebook, although once I had defined the Leornian Drill algorithm, its implementation with Django was still relatively straightforward.

Generating more requirements

What really expanded the scope of work was the decision to build the content platform features. This is shorthand for saying that the app is to be public, multi-user server software that accepts content submissions, and that this user content would be accessible to other user accounts. This is a setup that all major web frameworks, Django included, cater to, and in fact it’s probably the default application architecture that developers think of when they think about web apps. However, what is often not mentioned is that, if the implications of user-submitted content are taken seriously, then this feature entails a lot of considerations and added work, starting with features required to address content safety.

The big caveat is that this matters only if the app or platform can be reasonably expected to have a significant number of users. That is frankly not the case for Leornian. Without content platform features, developing the app would have been a lot simpler and faster, but even with platform features, I could just wing it and not worry about problems that present only at scale.

Nevertheless, I decided to take the path of greater resistance, to view the project as if it were a serious product (which arguably makes it one). One reason is that the content platform features justify the use of a server framework like Django, instead of desktop or client-side frameworks, albeit in an ex post facto fashion. But more importantly, Leornian is, again, a learning exercise, one taken in the spirit of Nicole Tietz-Sokolskaya’s invitation to “write more ‘useless’ software”.

The challenges of designing a public web app—something that I haven’t quite been able to take on at work before—were irresistible to me. Perhaps it is my background in industrial engineering and IT service management showing, but I’ve always thought of my attraction to holistic, systematic thinking-through as an advantage in my work as a software engineer. I can look at software through an organizational and process engineering lens, and I believe that this is an important skill for software engineers to have even if it is more associated with product and operations management; even if, for many programmers, it is not as sexy as working with the shiny new technology-of-the-month.

Non-technical considerations

So what were these non-technical considerations exactly? One is content safety, which had me reading about content moderation, and learning about the various approaches available. Content moderation is a hard problem, more social than technical, which means it requires social solutions more than technical ones, although the technical solutions are still essential. Mastodon is one social network that has reputable moderation features, so I looked into its system to help me determine the minimum feature set I needed for Leornian.

Content moderation brings with it additional requirements, including the need for a support system. For Leornian this became necessary because the content reporting system requires an account to use, to ensure reporting accountability. But since one possible moderation action is suspension (or even deletion) of an account, this means that in some cases, the affected users will be deprived of the ability to interact with the moderation module, to appeal or otherwise communicate with moderators—necessitating an anonymous support system.

Closely related to content safety are spam countermeasures. On this aspect, I only want to point out that I have dealt with a spam attack enabled by a sign-up form that lacked spam protection, which underscored the importance of evaluating all public entry points to a web app and employing appropriate defense mechanisms, balancing security and user experience while being guided by a proper threat model.

I also had to think about copyright, and the lifecycle of user content. Because one of my issues with NoteHub was the risk of lock-in, I knew I wanted to employ an open model for Leornian; early on, I decided on Creative Commons (CC) licensing for all user content. This posed interesting design questions, such as: does CC licensing mean users can be disallowed from deleting the content they submit? How can the need for retaining content (to preserve other users’ content collections) be balanced with authors’ control over their submissions, and their data privacy rights? The CC license gives authors the right to have attribution removed from their published works: how can this be implemented in Leornian, while retaining content integrity? All these questions and complications meant that the basic CRUD content lifecycle cannot be implemented straightforwardly.

Fortunately, there is also an existing service with similar concerns that I could model Leornian after: Stack Overflow, which also employs CC licensing for user content. Studying Stack Overflow’s model validated my design: for example, in Leornian, you can delete a note only if it is not saved in any other user’s collection; in Stack Overflow, you can delete a question you submitted only if it has not been upvoted and has no accepted answer, among other criteria. There were still many potential scenarios I had to work out, including for example the risk of the attribution-removal feature becoming a vector for spam (this is the reason that attribution data is internally retained temporarily after the user has requested removal).
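The deletion rule described above can be sketched in a few lines of plain Python. This is a minimal illustration and not Leornian’s actual code: the `Note` class, its `saved_by` field, and the `can_delete` helper are hypothetical names, and the choice to let the author’s own save not block deletion is an assumption made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Note:
    """Hypothetical stand-in for a note record; not Leornian's real model."""

    author: str
    text: str
    # Usernames of everyone who has saved this note to their collection.
    saved_by: set[str] = field(default_factory=set)


def can_delete(note: Note, user: str) -> bool:
    """Allow deletion only by the author, and only if no other user saved the note.

    Assumption: the author's own save does not count as a blocking save.
    """
    if user != note.author:
        return False
    other_savers = note.saved_by - {note.author}
    return not other_savers


note = Note(author="ana", text="Make it work, make it right, make it fast.")
note.saved_by.add("ana")
print(can_delete(note, "ana"))  # True

note.saved_by.add("ben")
print(can_delete(note, "ana"))  # False
```

In a Django project the same check would more likely be a query against a saves table than an in-memory set, but the rule itself stays this small.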

Return to pragmatism

Having worked out all these considerations, learning much in the process, I still needed to switch back to a pragmatic mindset, or else I wouldn’t have produced any working code at all. Thoughtful planning is important, and taking time to do it carefully at the start of a project makes a critical difference, but we know it is impossible to perfectly anticipate all outcomes, no matter how thoroughly the analysis is done, and that beyond a certain point, analysis becomes wasteful. This is why the agile methodology exists, why they tell us “YAGNI”; ultimately there are certain lessons that can only be learned through iteration, by going through the cycle of building, releasing, and evaluating, even (or especially) for a software project that is primarily a learning exercise.

So I arrested the analysis paralysis and scope creep, and prioritized completing a minimum feature set that made Leornian work well according only to my original, personal needs and wants; all the content platform aspects are secondary, and almost optional, and needed only rudimentary implementations. Concretely, this meant that, for example, I decided to build a support module that is really only a contact form and basic emailing system. If this were a product by a commercial organization, it would probably be wiser to integrate a scalable help desk SaaS offering, but in Leornian’s case I estimated that building a naive solution into the Django monolith would mean simplicity, and therefore lower required maintenance effort, in the long run.

Pragmatism itself is perhaps one of the most important lessons I learned from working on Leornian. It took me too long, but I think I’ve started to truly absorb principles like YAGNI and the mantra “make it work, make it right, make it fast, in that order”. Maybe I just had to scratch that itch first, to dive deep into optimization and over-engineering adventures, before I could see them as the misguided efforts they often are.

Humane design

In passing, I’d like to mention that there are some principles that I thought about while designing Leornian that are barely reflected in the current product, mainly because, at its current scale, they are not possible to implement or evaluate meaningfully. These are the aspects that I would like to see more of in all the software of the world, but especially social media and web platforms: namely, I wish that all platform developers would be more thoughtful about optimizing for user engagement, more conscientious about using variable reinforcement, habit formation, and other psychological techniques in software design, and more considerate, accepting, and respectful of human limits. The fact that developers go against these principles is a wicked problem, and needs an entirely different discourse.

Future plans

I have been using Leornian myself for weeks now, and am quite happy with it, even if it’s only at a minimum-viable-product level of development. Some of the much-needed but deferred features I would like to work on next are tagging/sets, and some form of search (could be naive, or full-text).

I should probably also make efforts in promoting the web app, perhaps to friends, and actually getting feedback. This post is a first step, although its length is not, as they would say, optimal. (But do we really have to optimize everything all the time?)

If ever Leornian gains a substantial number of users, it would be an interesting challenge to develop some areas, such as the Discover algorithm, which is currently only a basic random SELECT.
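For readers curious what a “basic random SELECT” looks like, here is a small self-contained sketch. It is an assumption-laden illustration rather than Leornian’s code: the `note` table, its columns, and the `discover` function are hypothetical, and it uses SQLite’s `RANDOM()` through Python’s standard `sqlite3` module so that it runs on its own. In Django, the rough ORM equivalent is `order_by("?")` on a queryset.

```python
import sqlite3


def discover(conn, limit=5):
    """Return up to `limit` rows from the (hypothetical) note table, in random order."""
    cursor = conn.execute(
        "SELECT id, text FROM note ORDER BY RANDOM() LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


# Build a throwaway in-memory database with twenty sample notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO note (text) VALUES (?)",
    [(f"note {i}",) for i in range(1, 21)],
)

picks = discover(conn, limit=5)
for note_id, text in picks:
    print(note_id, text)
```

Ordering by a random value makes the database assign a random number to every row and sort the whole table, which is fine at small scale but becomes a cost to watch as the table grows.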

But until that happens, I will just hack on Leornian at a leisurely pace. If you’ve read this far, I thank you, and I hope you try Leornian and find it useful.
