# Writing, not collecting

I’ve written a bit about my note-taking on this domain because I think systems for organizing knowledge are essential for any learner or knowledge worker. To be frank, I find it quite ridiculous that my university education never taught me how to manage my knowledge base, not even basic academic skills like managing references efficiently. I’ve too often repeated the pattern of following a course, learning for an exam, and being unable to reproduce much of it even six months later. (I’m painfully aware of the “forgetting” part, as I took a decade of university courses …)

Nobody has a perfect memory, so I don’t blame myself for not recollecting everything I’ve been taught throughout the years. But the thing is that for each course and for each paper, I did make extensive notes to compensate for my human limitations. All these notes are scribbled down in barely organized old notebooks or in docx files (!), stored in folders scattered across various backup drives. They are recorded, yet I do not remember. How come?

This little example shows a core paradox of note-taking. I took notes to record insights and knowledge in order to remember. But I had no productive way of applying the recorded knowledge and hence, I forgot. My old notes are locked in a dead past, instead of being a living memory.

This points to perhaps the most dangerous pitfall of note-taking. It’s very tempting to convince yourself you are learning just because you are writing down (in the sense of passively recording) what someone else says or writes. What we ultimately should care about is being able to use our knowledge to produce something new, whatever that may be. To do more than merely reproduce, you must understand the material. And understanding requires application, a hermeneutic principle that Gadamer in particular worked out extensively. If you really want to measure your level of understanding, you should try to apply the material, or explain it to yourself or someone else.

I distill this insight into a very simple principle for note-taking:

notes are things where I explain something to myself.

This also means, somewhat counterintuitively, that note-taking is not about collecting notes! The note is a tool for thinking. It is about tickling the primordial soup of the archive to stimulate yourself towards producing something, anything really. It is to write in the active sense of the word.

The Zettelkasten system embodies this change in mindset. Instead of focusing on folder hierarchies, organizing, and categorizing, focus on interlinking ideas and traversing the network of notes. This allows you to enter the archive at any place and follow traces, some of which may open very surprising lines of thought!

I committed to a flat file directory for all my notes. I’ve also stopped using the tagging system. Rather than helping me find and create ideas, I found myself managing and refining the tags. That is, even though the idea was about interlinking, it de facto ended up as a futile attempt at organization. This type of organization is simply not scalable or future-proof. In a few years, you’ll change your tagging conventions and will have to battle the desire to revisit all your other notes to update the archive. The same holds for folder organization: for another project, you’ll prefer another structure and be dissatisfied with your old solution. You’ll probably end up with a new directory, copying old notes over and duplicating them. If you notice this urge to organize, take a step back and do nothing!

This desire to archive and organize comes from the fear of losing notes. But as we’ve said before, recording something does not prevent you from losing it. You lose it when you don’t actively use it. If you mainly rely on interlinking to find notes, there may indeed be many ghost notes without incoming links. But that probably means that what was written in the note wasn’t that useful after all. It’s OK! Productive notes, on the other hand, will solidify their position in the rhizomatic network, because increasingly many paths lead to and from them. In the end, you’ll actually remember what’s in these notes, because their insights are used and engage with other ideas.

I know it’s scary. I know it’s hard. I still succumb to Archive Fever once in a while. But when you do, you must explain to yourself and repeat as a feverish mantra:

Writing, not collecting…

Writing, not collecting.

# Site update: Breadcrumbs, taxonomies, paginators

I have just added some new features to this website.

Adding breadcrumbs took a bit of puzzling. To get all the breadcrumbs, you essentially want to parse all the path components of the current URL relative to the base domain. Hugo’s split function returns an empty head and tail, which I needed to filter out.

<div id="breadcrumbs">
<!-- Remove empty first and last element -->
{{ $url := .RelPermalink }}
{{ $components := last 2 (split (delimit (split .RelPermalink "/") "," "") ",") }}
{{ $counter := 0 }}
<!-- The breadcrumbs -->
{{ range $components }}
{{ $counter = add $counter 1 }}
<!-- Following line is strictly not necessary anymore
because I filter out the empty elements above -->
{{ if gt (len .) 0 }}
{{ if eq $counter (len $components) }}
> <a href="{{ $url }}">{{ humanize . }}</a>
{{ else }}
> <a href="/{{ . }}">{{ humanize . }}</a>
{{ end }}
{{ end }}
{{ end }}
|
<!-- Navigation pages -->
{{ range .Site.Menus.main }}
{{ if ne $url .URL }}
<a href="{{ .URL }}">{{ .Name }}</a>&nbsp;&nbsp;
{{ end }}
{{ end }}
</div>


## {{ define "main" }} and baseof.html

I started this website without understanding much about … anything really. By now I have a better grip on Hugo’s templating language. I initially included the header and footer partials separately in every template, which amounts to a lot of unnecessary repetition. Hugo avoids this bad pattern by letting you define a baseof.html template, like so:

<!DOCTYPE html>
<html lang="en">
<body>
<div id="content">
{{- block "main" . }}{{- end -}}
</div>
{{ partial "footer.html" . }}
</body>
</html>


Now we no longer have to repeat the header and footer templates. Instead, other templates are responsible for filling in the main block with {{ define "main" }} ... {{ end }}. The next section shows a full example.

## Section index pages

Having proper index pages for sections is not hard to do at all and yet, I never had them. This is because I did not fully understand how Hugo treats _index.md pages in the content organization. But once I did understand, I still didn’t get them to work. Turns out I had disableKinds set for sections in my config.toml. I can’t remember ever enabling this, and it took way too long to figure out that it prevented Hugo from generating the index pages.

{{ define "main" }}

{{ partial "paginator.html" . }}

<div>
{{ .Content }}
<ul>
{{ range .Paginator.Pages }}
<h1><a href="{{ .Page.Permalink }}">{{ .Page.Title }}</a></h1>
{{ .Content }}
{{ end }}
</ul>
</div>

{{ partial "paginator.html" . }}

{{ end }}


In addition to the archive, which has a template of its own, the posts now have their own page where you can scroll through all the posts.

## Paginator

As you can see in the above snippet, I also introduced a paginator on the section index pages. You can use the default Hugo paginator with {{ template "_internal/pagination.html" . }}, but I built a simple custom paginator instead:

{{ $paginator := .Paginator }}
{{ if gt $paginator.TotalPages 1 }}
<div style="text-align: center; font-size: 1.5em; margin: 1em;">
<!--{{ template "_internal/pagination.html" . }}-->

<!-- First page. -->
<a href="{{ $paginator.First.URL }}"> 0 </a>
<!-- Previous page. -->
{{ if $paginator.HasPrev }}
<a href="{{ $paginator.Prev.URL }}"><--</a>
{{ end }}
{{ range after 1 $paginator.Pagers }}
{{ if eq . $paginator }} [ {{ .PageNumber }} ] {{ end }}
{{ end }}
<!-- Next page. -->
{{ if $paginator.HasNext }}
<a href="{{ $paginator.Next.URL }}">--></a>
{{ end }}
<!-- Last page. -->
<a href="{{ $paginator.Last.URL }}"> N </a>
</div>
{{ end }}


The paginator doesn’t appear if there is only one page, and the “next” arrow does not show when there’s no next page. Be aware that I used inline CSS styling, because I’m bothered by my browser/Hugo not picking up changes to the static style.css file in time.

## Series taxonomy

For the AI Ethics Tool Landscape I made heavy use of taxonomies, which made me realize I should also do this on my personal website. I added a new series taxonomy. I also wrote a terms.html template that not only lists the taxonomy terms, but also counts how often they occur:

{{ define "main" }}

<div>
<h1>{{ .Title }}</h1>
{{ .Content }}
<ul>
{{ range sort .Data.Terms }}
<li><a href="{{ .Page.Permalink }}">{{ upper .Page.Title }}</a> ({{ .Count }})</li>
{{ end }}
</ul>
</div>

{{ end }}


## Date of last modification

This is a really cool Hugo feature I wasn’t aware of before. If you keep your Hugo site in a git repository (which you should anyway), Hugo can pick up the latest commit that touched the current file and use it as the date of the last edit. With enableGitInfo = true in your configuration, it’s as simple as printing .Lastmod:

<aside>
<!-- Requires enableGitInfo = true in the site configuration -->
Last edited: {{ .Lastmod.Format "January 2, 2006" }}
</aside>


# Introducing the AI Ethics Tool Landscape

Many companies that employ AI now recognize the need for what the European Commission calls an “ecosystem of trust” 1. This has resulted in a large number of (ethical) principle statements for the usage of AI. These statements are often to a large degree inspired by pre-existing guidelines and principles, such as the European Commission’s Ethics Guidelines for Trustworthy AI 2 or the SAP guiding principles for AI 3. However, there are doubts about the effectiveness of these guidelines. A main challenge for the coming years is to effectively operationalise AI policy. In this post I introduce an open source project that hopefully takes a step in the right direction by listing available tools to support the ethical development of AI.

The tools are described and organized in the AI Ethics Tooling Landscape. The website itself and the corresponding README explain how the website works and how to contribute to it.

## Where guidelines fall short

AlgorithmWatch initiated an AI Ethics Guidelines Global Inventory 4 in which more than 160 AI ethics guidelines have been collected. The organizers of this initiative noted that the overwhelming majority of guidelines contained general promises, for example not to use biased data, but no concrete “recommendations or examples of how to operationalise the principles” 5. Evaluating the guidelines in this repository a year later, AlgorithmWatch states that often “it is a matter of weaving together principles without a clear view on how they are to be applied in practice. Thus, many guidelines are not suitable as a tool against potential harmful uses of AI-based technologies and will probably fail in their attempt to prevent damage. The question arises whether guidelines that can neither be applied nor enforced are not more harmful than having no ethical guidelines at all. Ethics guidelines should be more than a PR tool for companies and governments” 6. Because most ethical guidelines do not indicate “an oversight or enforcement mechanism” 5, they risk being rather toothless “paper tigers.” Ethics by its nature does not include any mechanism for its own enforcement, and the active promotion of ethics guidelines is even seen by some as an attempt to preemptively stifle the development of more rigid AI legislation that does enforce compliance 7. This suspicion that companies use ethics guidelines as a PR tool can lead to accusations of ethics washing 8.

## A call for operationalisation

This shows that despite a proliferation of principle statements — the “what” of ethical AI — there is not enough focus on the “how” of ethical AI. Papers that review the operationalisation of AI principles point out that currently these principles do “not yet bring about actual change in the design of algorithmic systems” 9. In a similar vein, Hagendorff points out that “it is noteworthy that almost all guidelines suggest that technical solutions exist for many of the problems described. Nevertheless, there are only two guidelines which contain genuinely technical explanations at all — albeit only very sparsely” 7. Equally stern, Canca concludes that “the multiplication of these mostly vaguely formulated principles has not proven to be helpful in guiding practice. Only by operationalizing AI principles for ethical practice can we help computer scientists, developers, and designers to spot and think through ethical issues and recognize when a complex ethical issue requires in-depth expert analysis” 10.

These citations from review papers show that there are serious doubts, to put it mildly, about the effectiveness of guidelines. These guidelines are great public statements, but how do we make sure they actually impact the work being done in professional AI communities? A main challenge for AI ethics and ethical AI in the coming years is therefore “to build tangible bridges between abstract values and technical implementations, as long as these bridges can be reasonably constructed” 7.

## Bridging the gap

This call for concretisation and operationalisation of AI principles does not mean, however, that AI principles can always and straightforwardly be technically implemented in a tool or computer program. This tension between abstract principles and concrete tools requires thoughtful navigation. From the perspective of principles, there is a strong call to make them as concrete as possible so that they can be used by the professionals actually creating AI applications. In this context, Hagendorff calls for a “microethics” 7 that engages with technical details, such as the way algorithms are designed, or how data is used throughout a machine learning pipeline. From the perspective of existing tools and techniques, it should instead be emphasized that they are not “plug-and-play” solutions to make an AI application ethical. Instead, appropriate use of these tools requires an ethical sensitivity, with attention to the context in which an AI technique is embedded and used. For example, if you use a fairness tool, which definition of fairness is used and is this definition appropriate given the type of application you are developing?

## The AI Ethics Tool Landscape

To these ends, I did an exploratory study of the available tools and techniques for ethical AI. Since the target audience for these tools is developers, I wanted to tailor the format of my findings to them. The majority of the tools are technical, but the value “accountability” includes non-technical tools that nevertheless meet the criterion of providing hands-on guidance for developers to engage with ethical AI. Instead of writing a large document on these tools (which no developer would ever read) I decided to program a website from scratch in the style of a wiki.

The website is called the AI Ethics Tooling Landscape and is set up as an open-source project. The website and corresponding README explain how the website works and how to contribute to it. Specifically, because the project content is completely based on simple text files that are automatically parsed, little to no technical know-how is required to contribute content.

Each tool is placed in a conceptual taxonomy, which I based on 1) existing typologies and taxonomies for categorizing (machine learning) tools 9 11, 2) insights from my study of e.g. explainability and fairness, and 3) things that are very practical for developers to know, such as the programming language of the tool or which frameworks are used. For example, Morley et al. use a matrix with ethical values on the x-axis and the development stage in which a technique applies on the y-axis. Similarly, the wiki categorizes tools by the value they support (accountability, explainability, fairness, privacy, security) and the stage in which they are useful (design phase, preprocessing, in-processing, post-processing). I fine-tuned the conceptual taxonomy incrementally based on feedback from developers. Morley et al. use more stages, but this results in an extremely sparse matrix. Based on feedback that the stages were indeed a bit too complicated, I arrived at the current broader stage categories, which are also used in the fairlearn tool.

However, due to the digital design of the website, I was not limited to a 2D matrix. Tools are also categorized on the following properties: whether they are model-agnostic or -specific; which type of data the tool handles; which type of explanation is supported (if applicable); which type of fairness is supported (if applicable); which programming framework is compatible (e.g. PyTorch); which programming languages are supported; and which machine learning tasks are relevant.

An interesting observation is that the tooling landscape is not uniformly distributed. Significantly more technical tools are available for the values explainability and fairness than for privacy and security, which reflects that research fields on topics like differential privacy and adversarial AI are relatively young.

At the time of writing, the website project contains 41 custom-made tool entries. Each of these entries contains metadata to categorize the tool, as well as my description of the tool with relevant information. All values, stages, explanation types, and fairness types have their own guiding descriptions as well.

1. European Commission. White Paper On Artificial Intelligence - A European approach to excellence and trust. Tech. rep. Brussels: European Commission, 2020, p. 27. ↩︎

2. High-Level Expert Group on Artificial Intelligence. The European Commission’s high-level expert group on Artificial Intelligence: Ethics guidelines for trustworthy AI. Tech. rep. European Commission, Apr. 8, 2019, pp. 1-39. ↩︎

3. Corinna Machmeier. SAP’s Guiding Principles for Artificial Intelligence. Sept. 18, 2018. url: https://news.sap.com/2018/09/sap-guiding-principles-for-artificial-intelligence/ (visited on 08/04/2021). ↩︎

4. AlgorithmWatch. AI Ethics Guidelines Global Inventory. url: https://inventory.algorithmwatch.org/ (visited on 03/16/2021). ↩︎

5. AlgorithmWatch. Launch of our ‘AI Ethics Guidelines Global Inventory’. Apr. 9, 2019. url: https://algorithmwatch.org/en/launch-of-our-ai-ethics-guidelines-global-inventory/ (visited on 08/04/2021). ↩︎

6. AlgorithmWatch. In the realm of paper tigers - exploring the failings of AI ethics guidelines. Apr. 28, 2020. url: https://algorithmwatch.org/en/ai-ethics-guidelines-inventory-upgrade-2020/ (visited on 08/04/2021). ↩︎

7. Thilo Hagendorff. “The Ethics of AI Ethics: An Evaluation of Guidelines”. In: Minds and Machines 30.1 (2020), pp. 99-120. doi: 10.1007/s11023-020-09517-8. ↩︎

8. Elettra Bietti. “From Ethics Washing to Ethics Bashing: A View on Tech Ethics from within Moral Philosophy”. In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. FAT* ‘20. Barcelona, Spain: Association for Computing Machinery, Jan. 2020, pp. 210-219. doi: 10.1145/3351095.3372860. ↩︎

9. Jessica Morley et al. “From What to How: An Initial Review of Publicly Available AI Ethics Tools, Methods and Research to Translate Principles into Practices?”. In: Science and Engineering Ethics 26.4 (2020), pp. 2141-2168. doi: 10.1007/s11948-019-00165-5. ↩︎

10. Cansu Canca. “Operationalizing AI Ethics Principles”. In: Commun. ACM 63.12 (Nov. 2020), pp. 18-21. doi: 10.1145/3430368. ↩︎

11. Alejandro Barredo Arrieta et al. “Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI”. In: Information Fusion 58 (2020), pp. 82-115. doi: 10.1016/j.inffus.2019.12.012. ↩︎

# Digest July 2021

This month BADBADNOTGOOD released a single in anticipation of their upcoming album Talk Memory (dropping at the beginning of October). The track takes you on a gloomy nine-minute journey that stays dynamic and tense from beginning to end. It is sometimes hard to distinguish the instruments from dark electro, and the solos are at no point middle-of-the-road. The rather enigmatic video clip features DJ Steves. Piece of art, really.

The best discovery of this month is the new collaboration between guitarist Blake Mills and bassist Pino Palladino, the full-length album “Notes with Attachments”. This album is so unique, and it gets better with every listen. The Tiny Desk concert is a great teaser for the whole album, and it’s fascinating to see the artistry up close, especially because the instruments sound quite alien at times. There are sections where I wouldn’t have believed I was listening to a saxophone. Blake Mills is the kind of guitarist for whom every note is supposed to be there, condensed to perfection. That holds generally for this whole album.

I watched Bo Burnham’s new musical comedy special Inside last week and it was brilliant. We see Burnham turn into a Jesus figure during a year spent inside his own guest house, musically addressing the ins and outs of a life forced to be lived online, and his declining mental health.

Tip from my housemate Maria, thanks! Psychedelic and you can dance to it, what’s not to like. New single to the upcoming album “City Slicker”:

This is not a new song, but I’ve been listening to this cover by Jordan Rakei a lot:

I’ve learned about Emma Ruth Rundle relatively late and for the next collaboration I’m also a year late to the party. Caution: We are ending on a heavier note:

# Mnemonic for closed-form Bayesian univariate inference with Gaussians

The following note helps me remember the closed-form solution for Bayesian inference using Gaussian distributions, which comes in handy very often. See Bishop p.98 (2.141 and 2.142) for closed-form parameter updates for univariate Bayesian inference using a Gaussian likelihood with conjugate Gaussian prior. Let’s start with the rule for the posterior mean:

$\mu_{new} = \frac{\sigma^2}{N \sigma_{0}^2 + \sigma^2} \mu_0 + \frac{N \sigma_{0}^2}{N \sigma_{0}^2 + \sigma^2} \mu_{MLE}$

Where we know that the maximum likelihood estimate for the mean is the sample mean:

$\mu_{MLE} = \frac{1}{N} \sum_{n=1}^N x_{n}$

See the bottom of this post for a derivation of the maximum likelihood estimate for the mean. Notice that the mean update has the following shape, balancing between the prior mean and the data sample mean:

$\mu_{new} = \lambda \mu_{0} + (1-\lambda) \mu_{MLE}$

With

$\lambda = \frac{\sigma^2}{N \sigma_{0}^2 + \sigma^2}$

I find this form a bit hard to remember by heart. We can also think of this weighting factor $$\lambda$$ as the prior precision divided by the posterior precision. This is not immediately obvious, but we can already see in the formula for $$\mu_{new}$$ that, as the posterior precision increases, the posterior density becomes more concentrated around the maximum likelihood solution for the mean, $$\mu_{MLE}$$, since then $$\lambda$$ diminishes and $$(1-\lambda)$$ will be larger.

Let’s first write down the posterior variance from Bishop and see how we can use it to support the above intuition:

$\sigma_{new}^2 = ( \frac{1}{\sigma_{0}^2} + \frac{N}{\sigma^2} )^{-1}$

This formula is much easier to remember in terms of precision $$\tau = \frac{1}{\sigma^2}$$:

$\tau_{new} = \tau_{0} + N \tau$

This intuitively reads: the posterior precision is the prior precision plus N times the precision of the likelihood function. The multiplication by N again shows the intuitive result that the more observations you make, the more “certain” the posterior distribution becomes.

It’s not immediately obvious that $$\lambda = \frac{\tau_{0}}{\tau_{new}}$$, so let’s write it out:

$\frac{\tau_{0}}{\tau_{new}} = \frac{\tau_{0}}{ \tau_{0} + N \tau }$ $= \frac{ \frac{1}{\sigma_{0}^2} }{ \frac{1}{\sigma_{0}^2} + \frac{N}{\sigma^2}}$

Multiplying the numerator and denominator by $$\sigma_{0}^2$$ gives:

$= \frac{1}{ 1 + \frac{N\sigma_{0}^2}{\sigma^2} }$

Multiplying the numerator and denominator by $$\sigma^2$$ gives:

$= \frac{\sigma^2}{ \sigma^2 + N\sigma_{0}^2 } = \lambda$

QED.
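As a quick numerical sanity check of this identity, we can plug in arbitrary example values (the numbers below are mine, chosen only for illustration):

```python
# Verify numerically that tau_0 / tau_new equals
# sigma^2 / (N * sigma_0^2 + sigma^2).
sigma2 = 2.0    # likelihood variance
sigma02 = 0.5   # prior variance
N = 10          # number of observations

tau = 1.0 / sigma2         # likelihood precision
tau0 = 1.0 / sigma02       # prior precision
tau_new = tau0 + N * tau   # posterior precision

lam_precision_form = tau0 / tau_new
lam_direct_form = sigma2 / (N * sigma02 + sigma2)

assert abs(lam_precision_form - lam_direct_form) < 1e-12
```

The assertion passes for any positive choice of the variances and N, since the two expressions are algebraically identical.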

This result suggests that it’s efficient to first compute the new variance, and then use this variance to compute $$\lambda$$ in the formula for the posterior mean.

So if you want to easily remember the parameter update rules, in natural language:

• The posterior precision is the prior precision plus N times the likelihood precision
• The posterior mean is a mix between the prior mean and the mean of the observed data sample
• The mixing coefficient of the prior mean is the prior precision divided by the posterior precision which we called $$\lambda$$
• $$(1-\lambda)$$ is the mixing coefficient of the sample mean.
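These four rules translate into a compact update function. The sketch below is my own Python rendering of the rules (function and variable names are mine, not from Bishop), cross-checked against the direct formula for the posterior mean:

```python
import numpy as np

def gaussian_posterior(x, mu0, sigma02, sigma2):
    """Posterior over the mean of a Gaussian with known variance
    sigma2, given observations x and a N(mu0, sigma02) prior."""
    N = len(x)
    tau0 = 1.0 / sigma02          # prior precision
    tau = 1.0 / sigma2            # likelihood precision
    tau_new = tau0 + N * tau      # rule 1: posterior precision
    lam = tau0 / tau_new          # rule 3: mixing coefficient of the prior mean
    mu_mle = np.mean(x)           # sample mean (MLE)
    mu_new = lam * mu0 + (1 - lam) * mu_mle  # rules 2 and 4
    return mu_new, 1.0 / tau_new  # posterior mean and variance

# Cross-check against the direct form of the posterior mean (Bishop 2.141).
rng = np.random.default_rng(0)
x = rng.normal(3.0, 1.0, size=20)
mu_new, var_new = gaussian_posterior(x, mu0=0.0, sigma02=2.0, sigma2=1.0)
N, s02, s2 = len(x), 2.0, 1.0
direct = (s2 / (N * s02 + s2)) * 0.0 + (N * s02 / (N * s02 + s2)) * np.mean(x)
assert np.isclose(mu_new, direct)
```

Computing the posterior precision first and reusing it for $$\lambda$$ is exactly the efficiency noted above.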

## MLE for the mean

First find the expression for the Gaussian log likelihood:

$\ln p(D|\mu, \sigma^2) = \ln \prod_{n=1}^N \mathcal{N}(x_n | \mu, \sigma^2)$ $= \sum_{n=1}^N \ln \mathcal{N}(x_n | \mu, \sigma^2)$ $= \sum_{n=1}^N \ln\left( \frac{1}{\sqrt{ 2\pi \sigma^2}} \exp\left( - \frac{(x_n - \mu)^2}{2 \sigma^2}\right)\right)$ $= N \ln\left( \frac{1}{\sqrt{2\pi \sigma^2}}\right) - \sum_{n=1}^N \frac{(x_n - \mu)^2}{2 \sigma^2}$ $= -\frac{1}{2 \sigma^2} \sum_{n=1}^N (x_n - \mu)^2 - \frac{N}{2} \ln(\sigma^2) - \frac{N}{2}\ln(2\pi)$

The next step is to find $$\mu_{MLE}$$ by taking the derivative with respect to $$\mu$$:

$\frac{1}{\sigma^2} \sum_{n=1}^N (x_n - \mu)$

Setting to zero gives the MLE:

$\mu_{MLE} = \frac{1}{N}\sum_{n=1}^N x_n$
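As a quick numerical check (not a proof), we can verify that the sample mean beats nearby values of $$\mu$$ in log-likelihood, using the expression derived above:

```python
import numpy as np

def gaussian_log_likelihood(x, mu, sigma2):
    """Log-likelihood of the data x under N(mu, sigma2)."""
    N = len(x)
    return (-0.5 / sigma2 * np.sum((x - mu) ** 2)
            - 0.5 * N * np.log(sigma2)
            - 0.5 * N * np.log(2 * np.pi))

rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=50)
mu_mle = np.mean(x)  # the claimed maximizer

# The log-likelihood at the sample mean beats any perturbed value.
for eps in (-0.5, -0.1, 0.1, 0.5):
    assert (gaussian_log_likelihood(x, mu_mle, sigma2=4.0)
            > gaussian_log_likelihood(x, mu_mle + eps, sigma2=4.0))
```

Shifting $$\mu$$ away from the sample mean by $$\epsilon$$ lowers the log-likelihood by exactly $$\frac{N\epsilon^2}{2\sigma^2}$$, so every perturbation loses.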