Just calling it Open is not enough - Challenges of public education infrastructures and how Nostr could help
I would like to share with you the concepts I am working on to make public education infrastructure more accessible and open with the help of Nostr. I work in the field of public education infrastructure, especially in the area of Open Educational Resources (#OER). OER are educational materials provided under an open license, usually a Creative Commons license (CC0, CC-BY, CC-BY-SA). Clear and open licensing makes it easy to adapt the learning materials to individual needs, improve them and republish them.
For many years, funding has gone on the one hand into the development of free educational materials and on the other into platforms, in particular repositories, that are intended to make these materials available. After all, the materials have to be published somewhere so that they can be found.
However, this only works moderately well.
Challenges
After many years of funding, the simple question: "Where can I make my OER material available?" cannot be answered easily. There are services where I can upload my OER, but it then remains locked in this platform and cannot be found on other platforms. In addition, these services are often tied to specific educational contexts or only release content after a quality check. This means that simple and public sharing is not possible.
These and other challenges have their origin in the fact that service and infrastructure are mixed up unfavorably in the architecture of public education infrastructure. By infrastructure I mean a public and openly accessible educational infrastructure on which data can be exchanged, i.e. provided and consumed. Such an infrastructure does not currently exist independently of the services that are operated on it. Infrastructure operators are currently always service operators at the same time. And because they (understandably) want control over what exactly happens in their service, they also restrict access to their infrastructure, which leads to them replicating the lock-in mechanisms of the large media platforms on the small scale of public education infrastructure.
It's a bit like every car manufacturer building the roads for their cars at the same time. But only for their cars.
Using a few examples of services that existing platforms offer on their infrastructures, I would like to highlight the challenges that I see in the current architectural concept:
- Upload of educational material
- Curation: compilation of lists, annotation with metadata
- Crawling, indexing and searching
- Cross-platform collaboration in communities -> example: quality assurance (whatever that means exactly)
- AI services -> example: AI-generated metadata for educational material
Material upload
The "material upload" service or the sharing of a link to educational material is provided by various OER platforms (wirlernenonline.de, oersi.org, mundo.schule).
In concrete terms, this means that if I upload content to one of the platforms, the content usually remains there and is not shared with the other platforms. The result for users: I either have to register everywhere and upload my material there (leading to duplicates) or live with the fact that only the users of the respective platform can find my content.
The Open Educational Resource Search Index (OERSI) addresses this challenge by collecting the metadata for educational materials from various platforms in an index. This index is in turn publicly accessible, so platforms can use it to consume metadata from other platforms. That is already very good. However, it only works for the platforms that OERSI indexes and not for all the others. OERSI is focused on the higher education sector, so other educational contexts are excluded. The approach of standing up a separate "OERSI" for each educational sector scales poorly, and the challenge remains that a corresponding importer/crawler must be written for every source that is to be indexed.
This pull approach is always lagging behind the materials.
However, there are even more limitations: the platforms have each specialized in specific educational contexts. This means that the question "Where can I make my OER available?" always has to be answered with the counter-question "For which educational sector?". Outside the general education sector or higher education, let alone outside the institutional education framework, the options become very, very thin. In short:
- It is not easy to provide OER so that it can also be found on different platforms.
Curation
By curation, I mean the compilation of content in lists or collection-like form as well as the annotation of these collections or the content with metadata.
Some platforms offer the option of categorizing content in lists. However, these lists are not portable. The list that I create on platform A cannot be imported to platform B. But that would be nice, because it would make it easier to expand the lists on other platforms or even make them collaborative, while avoiding lock-in effects.
Various centralizing factors come into play when annotating with metadata. In current practice, metadata is usually defined at the time the content is provided, typically by a person or editorial team, sometimes with the support of AI services that assist with metadata entry. But how can users add their own metadata? How can I communicate that this material would be great not only for biology but also for sports on topic XY? The current approaches cannot fulfill this requirement. They do not utilize the expertise and potential of their users.
- There are no interoperable collections
- Metadata annotation is centralized
- Users cannot add their own metadata
Crawling, indexing and searching

Since users do not want to visit many different platforms and websites to search for suitable content, the "big" OER aggregators crawl them to index the metadata of the content, either via various interfaces or via the raw HTML. The HTML crawlers are very time-consuming to write, error-prone and quickly break when the website design changes, while the interface-based importers are somewhat more stable as long as the interface does not change. The use of the General Metadata Profile for Educational Resources (AMB) has improved the situation somewhat. Some platforms now offer a sitemap containing links to educational material, which in turn contain embedded script tags of the type application/ld+json, so that the metadata can be imported from there.
Example: e-teaching.org offers a sitemap for its OER here: https://e-teaching.org/oer-sitemap.xml and there is a corresponding script tag on the respective pages.
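For illustration, here is a minimal sketch of how such an import might look, assuming Node 18+ with built-in fetch; the regex-based extraction is a simplification for illustration, not how any particular aggregator actually works:

```typescript
// Minimal sketch: pull embedded JSON-LD (e.g. AMB) metadata out of an OER page.
async function extractJsonLd(url: string): Promise<unknown[]> {
  const html = await (await fetch(url)).text();
  // Grab every <script type="application/ld+json"> block from the page.
  const pattern = /<script[^>]*type="application\/ld\+json"[^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: unknown[] = [];
  for (const match of html.matchAll(pattern)) {
    try {
      blocks.push(JSON.parse(match[1])); // each block is one metadata record
    } catch {
      // skip malformed JSON instead of aborting the whole crawl
    }
  }
  return blocks;
}

// Usage, e.g. for a page listed in the oer-sitemap.xml mentioned above:
// extractJsonLd('https://e-teaching.org/...').then(console.log);
```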
This is already much better, but there is still more to do:
First of all, this approach is only practicable for platforms and stakeholders who have the IT resources to implement the relevant functionality. Teachers cannot simply set this up on their private blog or similar. Secondly, there is still a discovery problem: I still need to know where to look. I need to know the sitemaps, otherwise I won't find anything. Instead of an approach in which actors can independently announce that they have new content (push approach), we are currently pursuing an approach in which each platform acquires content for itself using the pull method. This leads to duplicated work in many places, is inefficient (several people build exactly the same crawlers, but each for their own platform) and excludes small players in particular (is it worth programming a crawler if the website "only" provides 50 materials?).
Instead of sharing the data they develop, the platforms work for themselves, or at most make it available again behind their own (open or closed) interfaces. That's probably not what we imagine an open and collaborative community to be, is it?
When it comes to searching, we face similar challenges to those described above. Although various OER aggregators in the form of repositories or referatories already index many of the "smaller" platforms and thus offer a comprehensive search, it is not possible to search across these aggregators together. Ultimately, this means that users have to visit several platforms again if they want to search the entire OER pool.
- In many places, content is duplicated, but always for one's own platform
- There is no shared data space into which stakeholders can "push" content
- There are no cross-platform search options
Cross-platform collaboration
That would be nice, wouldn't it? It's a mystery to me how #OEP (Open Educational Practices; an exact definition by the community is still pending) is supposed to work without it. But as far as I know, there are not even any approaches to how this could be implemented technically (or are there? let me know).
One scenario for such cross-platform collaboration could be quality assurance. Let's say that two platforms / communities have agreed on something that they call "quality", but how can this seal of approval be applied to the content?
Platform A: Well, then everyone come to us. We can do it here and then we can hang a nice badge on the materials.
Platform B: Yes, but then it won't be on our materials. Besides, we want/need to work on our own platform, because what is the point of our platform if we do everything at yours?
- Although everyone is now talking about #OEP, there are no technical approaches for how cross-platform collaboration could be implemented
AI services
What would be complete these days without mentioning AI? If only for the next funding application, something has to be done with AI...
Various projects are developing helpful and impressive AI services: for example, to facilitate the annotation of content with metadata, to add metadata automatically, to find content on specific topics or to (semi-)automatically add content to collections. But (maybe you've already guessed it): they only work on their own platform, presumably because the services are developed close to the platform's own data model. And since the data never leaves this silo, that fits. The result is that the same services are developed several times in different places.
- AI services often only work on the platform for which they are developed
Summary of the problems
We already do a lot of things very well (use of the AMB, open educational materials, we have a great community) and now we just have to go further.
(The OER Metadata Group, which developed the General Metadata Profile for Educational Resources (AMB), does not receive any direct funding for its work. At the same time, it is a central point of contact for all those who handle metadata in open educational infrastructures and the metadata profile is one of the few application profiles that is publicly accessible, well documented and offers validation options).
If we take a bird's eye view of the platform landscape and the challenges described, we can distinguish three intertwined core components that help us understand the problems better:
- Users
- Service
- Data
Users: Users are active on (almost) all platforms. They upload material, annotate it with metadata, take part in a community, search for content, etc. Regardless of whether they can or must log in, we offer our users something so that they can hopefully derive added value from it.
Service: This is the something. The "website", the interface, the place where the user can click and do something. It is what often gives the data a "visual" form. The service is the intermediary, the interface between the user and the data. With the help of the service, data can be created, changed or removed (there are of course many non-visual services that allow interaction with data, but for most normal people, there is something to click somewhere).
Data: The information in structured machine-readable form that can provide added value to the user in rendered form through a service. It is difficult for us to capture unrendered data (we are not Neo). This can either be the metadata for educational materials, the materials themselves, profile information, collections of materials or similar.

In my opinion, many of the challenges described above have their origins in the fact that the three core components of user, service and data have been unfavorably combined. This is not an accusation, because this is exactly the way platforms have always been built in recent years (decades?):
- Users, service and data are bundled into one platform
This means that through my service, the users interact with the data and I can ensure that everything works well together in my little world. This makes sense if I'm Microsoft, Facebook, X or something similar, because that's exactly my business model: Locking users in (lock-in), taking away their sovereignty over their content (or can you migrate your Facebook posts to X?) and, if possible, not letting them out again.
But our projects are public. These are not the mechanisms we should be replicating. So what now?
Educational infrastructures based on the Nostr protocol
Nostr
In 2019, a pseudonymous person by the name of "fiatjaf" described a concept for a social media protocol, "Nostr - Notes and Other Stuff Transmitted by Relays", as follows:
It does not rely on any trusted central server, hence it is resilient, it is based on cryptographic keys and signatures, so it is tamperproof, it does not rely on P2P techniques, therefore it works.
Fiatjaf, 2019
The core components of the protocol consist of:
- JSON -> data format
- SHA256 & Schnorr signatures -> cryptography
- WebSockets -> data exchange
And it works like this:
Users have a "key pair": a private key (which you keep for yourself, just for yourself) and a public key, which you can show around, this is your public identity. You use it to tell other users: Look here, that's me. The two keys are connected in a "magical" (cryptographic) way: The public key can be generated from the private key, but not the other way around. This means that if you lose your public key: no problem, it can always be restored. If you lose your private key: bad luck, it is virtually impossible to recover it.
However, the key magic goes even further: you can sign "messages" with your private key. This signature, which you create using the private key, has a magical property: anyone can use the signature and your public key to verify that only the person who also has the private key for this public key could have signed this message. Magical, right? Don't understand completely? No big deal, you're probably already using it without realizing it. It's not a fancy new technology, it's well established and widely used.
Remember: users have a key pair and can use it to sign messages.
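A minimal sketch of what this looks like in code, using the nostr-tools library (v2 API; names may differ in other versions):

```typescript
// Generate a key pair, sign an event, verify the signature.
import { generateSecretKey, getPublicKey, finalizeEvent, verifyEvent } from 'nostr-tools/pure';

const sk = generateSecretKey();   // private key: keep this to yourself
const pk = getPublicKey(sk);      // public key: your public identity

// finalizeEvent computes the event id and signs it with the private key.
const event = finalizeEvent({
  kind: 1,                        // kind 1 = a short text note
  created_at: Math.floor(Date.now() / 1000),
  tags: [],
  content: 'hello, open education',
}, sk);

// Anyone holding only the event (which includes the pubkey) can verify it.
console.log(verifyEvent(event));  // -> true
console.log(event.pubkey === pk); // -> true
```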
Then there are the services. Services basically work as described above. They allow users to interact with data. But with Nostr it's a little different than usual, because: The data doesn't "live" in the services. But where then?
When users create, modify or remove a data record, this "event" (as it is called in Nostr) is signed with their private key (so it is clear to everyone that only they can have created it) and then sent to several "relays". These are the places where the data is stored. When a user logs into a service, the service retrieves the data it needs from these relays. User, service and data are therefore decoupled. The user can switch to another service and retrieve the same data from the relays. No lock-in possibilities.
Note: User, service and data are decoupled.
Finally, there are the relays. Relays are locations. They are the places to which the events, i.e. the users' data, their interactions, are sent and from which they are requested. They are like the backend of Nostr, but they don't do much more than that: accept events, distribute events. Depending on the configuration, only certain users are allowed to write to or read from a relay.
The basic design of the protocol is based on openness and interoperability. No registration is required, only key pairs. The authenticity of an event can still be ensured using cryptographic procedures, as only the owner of the respective key pair could create this event in this way. The relays ensure that the data is sent to the desired locations and as we use more than one, we have a certain degree of reliability. As the data only consists of signed JSON snippets, we can easily copy it to another location in the event of a failure. The signatures also ensure that no changes have been made to the data in the meantime.
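Under the hood, this is nothing more than JSON messages over a WebSocket, as specified in NIP-01. A sketch, assuming an illustrative relay URL and an event signed as in the earlier key-pair sketch:

```typescript
// Publish an event to a relay and subscribe to events, using the raw
// NIP-01 message format.
import WebSocket from 'ws';

declare const signedEvent: object; // an event signed as in the earlier sketch

const ws = new WebSocket('wss://relay.example.com');

ws.on('open', () => {
  // client -> relay: ["EVENT", <signed event>]
  ws.send(JSON.stringify(['EVENT', signedEvent]));
  // client -> relay: ["REQ", <subscription id>, <filter>]
  ws.send(JSON.stringify(['REQ', 'sub1', { kinds: [1], limit: 10 }]));
});

ws.on('message', (raw) => {
  const msg = JSON.parse(raw.toString());
  // relay -> client: ["EVENT", "sub1", <event>] or ["EOSE", "sub1"]
  if (msg[0] === 'EVENT') console.log('received event:', msg[2]);
});
```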
Example: A Nostr event
Here is a small technical digression that describes how Nostr events are structured. If you are not so interested in the technical details, feel free to skip this section.
Every Nostr event has the same basic structure, with the following attributes:
- id: the hash of the event
- pubkey: the pubkey of the creator of the event
- created_at: the timestamp of the event
- kind: the type of the event
- tags: an array in which additional metadata for the event can be stored
- content: the textual content of the event
- sig: the signature of the event, used to check the integrity of the data
```json
{
  "id": <32-bytes lowercase hex-encoded sha256 of the serialized event data>,
  "pubkey": <32-bytes lowercase hex-encoded public key of the event creator>,
  "created_at": <unix timestamp in seconds>,
  "kind": <integer between 0 and 65535>,
  "tags": [
    [<arbitrary string>...],
    // ...
  ],
  "content": <arbitrary string>,
  "sig": <64-bytes lowercase hex of the signature of the sha256 hash of the serialized event data, which is the same as the "id" field>
}
```
The event types used and the existing specifications can be viewed at https://github.com/nostr-protocol/nips/ or on [nostrhub.io](https://nostrhub.io/).
It is also important to note that you can simply start developing applications. The relays will accept all events that follow the above scheme. So you don't have to ask anyone for permission or wait until your specification has been accepted and added.
You can just build things.
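As a sketch of what "just building things" could mean for education: an event that carries AMB-style metadata for a material. The kind number 30999 and the metadata snippet are made-up placeholders for illustration, not an accepted specification:

```typescript
// Hypothetical OER metadata event: kind 30999 is an illustrative placeholder,
// not a registered event kind.
import { finalizeEvent } from 'nostr-tools/pure';

declare const sk: Uint8Array; // the author's private key, as in the earlier sketch

const oerEvent = finalizeEvent({
  kind: 30999,
  created_at: Math.floor(Date.now() / 1000),
  tags: [
    ['d', 'https://example.org/material.pdf'], // stable identifier of the material
  ],
  // Illustrative AMB-style JSON-LD snippet as the event content:
  content: JSON.stringify({
    '@context': 'https://w3id.org/kim/amb/context.jsonld',
    name: 'Photosynthesis worksheet',
    license: { id: 'https://creativecommons.org/licenses/by/4.0/' },
  }),
}, sk);
// Sending ["EVENT", oerEvent] to any open relay requires no one's permission.
```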
Excursus: Nostr for binary data - Blossom
Yes, but... isn't this only suitable for text-based data? What about binary data (images, videos, PDFs, etc.)?
This data is often quite large, and best practice is not to store it on relays but to use a more suitable publication mechanism for these data types. The approach is called "Blossom - Blobs stored simply on mediaservers" and is quite straightforward.
Blossom servers (nothing more than simple media servers) use Nostr key pairs to manage identities and sign events. The blobs are identified by their sha256 hash. Blossom defines some standardized endpoints that describe how media can be uploaded, how it can be consumed and so on.
The details of how authorization and the respective endpoints work are described in the aforementioned specification.
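A sketch of an upload as I read the Blossom spec: the request body is the raw blob, and the request is authorized with a signed Nostr event (kind 24242) sent base64-encoded in the Authorization header. The server URL is illustrative; check the spec for the authoritative details:

```typescript
// Upload a blob to a Blossom media server (sketch, Node 18+).
import { createHash } from 'node:crypto';
import { finalizeEvent } from 'nostr-tools/pure';

async function uploadBlob(server: string, data: Buffer, sk: Uint8Array) {
  const sha256 = createHash('sha256').update(data).digest('hex');

  // Authorization event: kind 24242, scoped to this upload and time-limited.
  const auth = finalizeEvent({
    kind: 24242,
    created_at: Math.floor(Date.now() / 1000),
    tags: [
      ['t', 'upload'],
      ['x', sha256], // hash of the blob being uploaded
      ['expiration', String(Math.floor(Date.now() / 1000) + 300)],
    ],
    content: 'Upload OER material',
  }, sk);

  const res = await fetch(`${server}/upload`, {
    method: 'PUT',
    headers: {
      Authorization: `Nostr ${Buffer.from(JSON.stringify(auth)).toString('base64')}`,
    },
    body: data,
  });
  return res.json(); // blob descriptor: url, sha256, size, type, ...
}
```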
Nostr 🤝 Public education infrastructures
How could these challenges be solved if we used Nostr as the basis for public education infrastructure?
Material upload
- It is not easy to make OER available so that it can be found on different platforms.
With Nostr as the base infrastructure, the metadata and binary data would not be linked to the service from which it was provided. Binary data can be hosted on so-called Blossom servers. Metadata, comments and other text-based data are distributed via the relay infrastructure. As data and service are decoupled, the OER materials can be consumed from different applications.
Curation
- There are no interoperable collections
- Metadata annotation is centralized
- Users cannot add their own metadata
Collections are interoperable per se: how lists work is defined at the protocol level. Annotation with metadata is not centralized at any point. The RDF community's promise that "anyone can say anything about any topic" is realized here. And I don't have to listen to everything: maybe I only consume metadata events from certain editorial teams or users, or only those close to my social graph. In any case, every user has the option of providing metadata.
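For example, a collection can be expressed as a NIP-51 curation set (kind 30004, as I read the NIP): an addressable list that any client can fetch, render and extend. The event ids below are placeholders:

```typescript
// An interoperable collection as a NIP-51 curation set (sketch).
import { finalizeEvent } from 'nostr-tools/pure';

declare const sk: Uint8Array; // the curator's private key

const collection = finalizeEvent({
  kind: 30004,
  created_at: Math.floor(Date.now() / 1000),
  tags: [
    ['d', 'photosynthesis-oer'],       // stable identifier: makes the list addressable
    ['title', 'Photosynthesis materials'],
    ['e', '<event-id-of-material-1>'], // placeholder references to material events
    ['e', '<event-id-of-material-2>'],
  ],
  content: '',
}, sk);
// Any service can fetch this list from the relays; none of them "owns" it.
```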
Crawling, indexing and search
- Content is duplicated in many places, but always for one's own platform
- There is no shared data space into which actors can "push" content
- There are no cross-platform search options
No more duplicate indexing: once a user has published a metadata event to the network, it can be consumed by everyone. The data space is shared per se. Cross-platform search is made possible by the combination of relays and NIPs: query formats can be defined in the respective NIPs, and relays can indicate which NIPs they support. In Nostr, a cross-platform search is a cross-relay search.
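A sketch of such a cross-relay search, using the NIP-50 search filter field via nostr-tools' SimplePool; the relay URLs are illustrative and would need to support NIP-50:

```typescript
// Query several relays with one NIP-50 search filter (sketch, ESM module).
import { SimplePool } from 'nostr-tools/pool';

const pool = new SimplePool();
const relays = ['wss://relay-a.example.com', 'wss://relay-b.example.com'];

// One query fans out to all relays; the pool merges the results.
const results = await pool.querySync(relays, {
  kinds: [1],               // event kinds to search
  search: 'photosynthesis', // NIP-50 full-text search field
  limit: 20,
});
console.log(`${results.length} matching events across ${relays.length} relays`);
```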
Cross-platform collaboration
- Although #OEP is now on everyone's lips, there are no technical approaches for how cross-platform collaboration could be implemented
Nostr is the technical approach.
AI services
- AI services often only work on the platform for which they are developed
In Nostr, there is the concept of data vending machines (see also data-vending-machines.org). So instead of just building an API (which is already very nice if it is openly accessible), these services could also act as actors in the Nostr network and accept and execute jobs. The nature of the jobs can be described in a specification so that the functionality is easy to understand for all interested participants in the network.
The services could even be monetized, opening up opportunities to develop business models.
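A sketch of such a job request per NIP-90; the kind number 5999 is an illustrative placeholder from the 5000-5999 request range, not a registered job type:

```typescript
// NIP-90 job request (sketch): ask any willing DVM in the network to
// generate metadata for a material. Results come back as kind 6999
// (request kind + 1000) per the NIP-90 convention.
import { finalizeEvent } from 'nostr-tools/pure';

declare const sk: Uint8Array; // the requester's private key

const jobRequest = finalizeEvent({
  kind: 5999, // illustrative placeholder in the job-request range
  created_at: Math.floor(Date.now() / 1000),
  tags: [
    ['i', 'https://example.org/material.pdf', 'url'], // input to process
    ['output', 'application/json'],                   // desired result format
  ],
  content: '',
}, sk);
// Publish to relays; any DVM may pick it up, do the work, and publish the result.
```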
Conclusion
The open education community is great: unique and incredibly committed people who have dedicated themselves to the noble goal of accessible education for all -> open education. We use Creative Commons licenses -> Commons -> common goods. It's okay that many projects are dependent on sponsors and funding. What we do is in the spirit of a commons: public education for all. So we all pay for it as a community.
What is not okay is that what we have all paid for can no longer be found after a short time. That it is locked away in publicly funded data silos. It must also be available to everyone in the long term. Otherwise it is not accessible, not open. Then the O in OER is just a label, marketing to secure three years of funding for a make-work scheme. Because that is all content development is if the content is thrown away after three years.
And the same applies to OEP. Open learning practices are also just a phrase if we don't think about the right technical infrastructure that enables real openness and collaboration and thus the implementation of open learning practices.
And if we don't think about adapting the infrastructure for open learning now, then we will probably be able to see in a few years what will be left of it in the event of political reorientation. If the funding pots are cut completely, what will be left of the money invested?
We need solutions that committed communities can continue to operate on their own, and whose heads cannot be chopped off without us being able to put two new ones in their place.
We need to think about this now.
How open does public education infrastructure want to be?