On the Eve Before Dawn
When I started working on EvE at the end of November, I didn’t expect it to take this long. I knew that it would be quite an ordeal, but the optimistic side of me still underestimated the challenges to come, both technical and emotional. Now, on the brink of its completion, I can finally step away from the code editor and discuss some of the more nuanced yet significant aspects. I’ve found writing notes like this to be the only way to organize my constantly chaotic and conflicting thoughts.
EvE is an encoding program for Dawn Objects that falls under the umbrella of Renaissance. It’s by far the most ridiculous thing I’ve ever written and the first of its kind to attempt the ultimate problem of natural object representation (see previous notes on my foolish Trinity Problem/Leibniz). With Renaissance EvE, resource objects (media, for now) are constructed in a way that lets them be identified and verified authentically, directly and independently, solely with sensory information. Note that these Dawn objects aren’t merely files (codes) but conceptual and abstract entities. A photograph in Dawn is not a copy of a jpeg file loaded with RGBA and metadata (or chunks of hexadecimal strings), but a specific existence that has such data as its possessions/extensions. As a matter of fact, in most cases, the “corporeal” part (the jpeg file) of a Dawn object is not even kept in Dawn at all. This shift in the nature of the subject matter is more than rhetorical; it is the foundation that guarantees absolute uniqueness and immutability, both necessary conditions for Dawn to exist and operate. To acknowledge and, at the same time, emphasize this dichotomy, as has always been done conventionally, I am giving this “hoisted” existence a name, Ling, which I will elaborate on shortly.
As mentioned in the beginning, EvE is from the Renaissance family and adheres to its naturalistic and non-exhaustive tenets that guide all aspects of the program development. Therefore, it exploits an existing model in the real world, instead of an engineered one trained on growing data. Although such design is most likely out of my own personal beliefs, it does have a practical (up)side — making resources easier to manage and, more importantly, distribute in a decentralized fashion. We will revisit this subject when we discuss the algorithms of EvE.
In short, EvE builds and maintains a special representation layer for Dawn resource objects. A not-so-proper analogy would be to compare it to crafting unique faces and giving them to human beings so that we all look different from a third-person perspective. Imagine a planet where everyone looks the same. How can we tell one from another? We might have to do a DNA test every time we run into each other or ask countless questions for verification (“What’s your birthday?”, “What’s the name of your first pet?”, etc.). It sounds comical, but this is indeed how life in our beloved digital world is. Therefore, while adding a functional dimension to conceptual objects, EvE can in practice also be seen as an identification mechanism that feeds on raw input data, similar to but not the same as, for example, Google’s reverse image search. I will circle back to this use case later.
Ling and Indiscernibles
In Tao Te Ching, Laozi says:
Literal translation: From Tao there is one; one two; two three; three everything.
It’s a statement hailing from 400 BC on the origin of everything and arguably the most important one in Taoism. Of course, the one, two and three here are not numbers. However, the everything is indeed everything.
Six hundred years later, Ge Hong, a linguist, philosopher, physician and politician, offered his own thoughts on the “everything”:
Literal translation: Mountains, water, grass and trees, or wells, stoves, pools and ponds, they all have their own spirits.
Rather than seeing inanimate objects as just objects, Taoism believes that all things, from a stone on the top of Himalaya to a broom in the corner of your house, have their own energies and spirits (灵, Ling).
The ancient sages would’ve never thought or, honestly, cared that two thousand years later someone would apply that doctrine to a computing-based system, but that’s exactly what Dawn is built for: to host, serve and nurture the Ling of everything. Of course, for now, it’s limited to digital creative content; Dawn for tangible goods will differ subtly on a philosophical level due to the nature of indivisibility, among other things, and a lot on the technical side. Although the thought itself is exciting and mind-blowing enough, it will be a long time, if ever, before I get to work on computing tangible goods in the physical world. As a matter of fact, we can even go beyond the product category and achieve far more upstream.
Another principle fundamental to Dawn objects and their Lings is one formulated by Gottfried Wilhelm Leibniz. I idolize Leibniz and can go on and on about the many fascinating things he did, such as inventing the stepped reckoner or designing a windmill. However, just as I see da Vinci as an engineer and scientist more than an artist, Leibniz, to me, was more of a thinker than a mathematician. Calculus was probably simply a worthy byproduct of his brilliant mind while mulling over life, God and mechanics. I often find solace and joy in his (austere) writings, as if I were talking to someone as lonely as I am, though from hundreds of years ago. One of his lesser-known ideas, compared to the more famous ones such as “the best of all possible worlds,” is the Identity of Indiscernibles. The principle holds that there are no two objects that have exactly the same properties.
If, for every property F, object x has F if and only if object y has F, then x is identical to y.
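For readers who prefer symbols, the same statement is commonly written in second-order logic as below, where $F$ ranges over properties and $x$, $y$ over objects. This is only the standard textbook notation, not anything specific to Dawn.

```latex
\[
\forall x\,\forall y\,\bigl[\,\forall F\,(Fx \leftrightarrow Fy) \;\rightarrow\; x = y\,\bigr]
\]
```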
It’s controversial, but I am not here to debate that. The absolute adoption of the Identity of Indiscernibles is critical to the entire Dawn complex and precisely what makes EvE possible because Dawn Objects shall and must be categorically and unconditionally unique. The principle has a number of implications that impact Dawn and is more complicated than it appears to be. They will be mentioned in the rest of this article.
Although it is rare to put Laozi and Leibniz together, Leibniz himself probably wouldn’t find it too strange, as he was rather interested in Taoism at a time when not many people had even heard of it. But, again, what was he not interested in? Romance, maybe?
In Dawn We Live
It is stated above that Dawn Resource Objects have “Lings,” but the better explanation is that Dawn provides an environment where the Ling of a resource object is recognized, and therefore the object, as a whole, can have a continuous identity throughout its lifecycle, especially between states/sessions where its data files are duplicated, deleted or modified. The ancient concept of Ling may remind you of many things, especially if you are into metaphysics like me. Descartes, for one, may easily come to mind with his mind-body dualism. In some scenarios, that model seems to hold true. The chair I am sitting on right now is obviously a body (extension), and its Ling, likely not very happy about my weight right now, can be analogous to the mind. Of course, Descartes wouldn’t accept that my chair could think and have ideas. After all, this was the guy that did this. I am not equating Ling with Descartes’ soul either, but the comparison is good enough for our discussion here. However, things get a bit tricky when we enter digital territory, where there’s not much to reference. As a matter of fact, the ontological nature of digital objects, as a subject, is largely and urgently missing from the public discourse. While a whole new generation is enthusiastically redefining “being” and all types of identities, it seems that digital entities, a concept that has existed for less than half a century, are simply accepted as-is without any hesitation. To some extent, it is understandable since, in most cases, digital objects serve a purely functional purpose on the application level: a Word document to read, a video to play or a piece of software to install. It also helps that we perceive them to be in a different world that we have control over. In the worst scenario, we can just pull the plug to make the troubles go away (that’s what my mom believes and would do every time). This is not the case when one first sees a microwave.
He or she would seriously wonder what it really is, how it works and whether it is safe to use. Such a thought process on the nature and the existence of an object itself rarely happens with chunks of binary digits lying in the middle of nowhere. Does it matter? It obviously does to Dawn, as this cognition is the basis for establishing the “Lings” and, further, the objects in the system. However, even outside Dawn, with the virtual economy on the rise, this discussion seems more relevant than ever. K-pop, after sweeping the world with generations of young and fabulous talents, is introducing virtual idols that will NEVER age, gain weight or forget lyrics. Unlike video game or anime avatars, these idols, modeled after real persons in the most unrealistic way, are meant to take on a truly unique identity in the real world that is believable enough to attract fans. For someone who lives in the jungle (but with Wi-Fi), the difference between Jennie from Blackpink, who only exists in viral YouTube videos, and Jenny the virtual idol, who dances flawlessly on his or her phone, can be smaller than one presumes. Here, it seems inevitable and even logical to derive and acknowledge some type of continuous identity from the set of constantly changing computer files that power virtual idol Jenny. This identity is a special case of Ling, as it is designed with a specific agenda to imitate human lives. However, if we can accept that, then it is fairly reasonable to apply it in general, for the difference between the fundamental nature of these virtual idols and that of other boring computer files is minimal, if there is any at all.
As mentioned above, Dawn recognizes such Ling for all resource objects. Other than my own personal belief, which has undeniably played a minor role here, Ling is why and how Dawn objects survive and sustain in the first place. Unlike tangible goods, which remain relatively stable and consistent in their forms, digital files are constantly changing. Every time we open a plaintext file, its metadata is going to be modified. Strictly speaking, it becomes a different file. Every time we move or delete a file, again, technically, it is a new file. It becomes more complicated when other machines are involved. If we stream a video file, lol.mkv, hosted on one server to thousands of clients in millions of small packets of bits, are the thousands of downloaded copies of lol.mkv the same file? If not, what are they? You might argue that I confuse actual files with overarching concepts, but that’s exactly why Ling is the more significant part of resource objects as well as Dawn; it not only addresses but also fills the gap between the two. Furthermore, as the files are subject to change, in some way, Ling is the object that contains a set of files as the content body (“attribute”). Therefore, although it is tempting to imagine a mind-body relation between Ling and data material, Dawn objects are closer to the ideas of Spinoza than those of Descartes.
Establishing Ling as the primary subject of a resource object is key to the Dawn identity scheme. When we search for something on the Internet, the results that come back are not possible matches of the object but instances of objects that resemble or are related to the original object, based on the available object data. It is easier to demonstrate with examples. If we try to look up an image of a dress (via Google reverse image search), what we are actually getting is the web pages Google has indexed where an image of a dress that shares similar features can be found. While it is vastly useful and efficient, its reliance on the platform’s (the Google search engine’s) algorithms can lead to problems that grow over time as more sibling copies are added. In some cases, it can turn dangerous because the platform, no matter how ethically the algorithms are engineered, is almost always going to favor larger sites with more traffic. With the major search engines being the de facto Internet, resources from smaller and independent sources are at a distinct disadvantage. This bias mirrors that of our modern-day market model abused by corporatism in general. Going back to our photograph example, it is much easier to locate the image on a popular site than on the original photographer’s personal website, even if the image on the popular website is an unauthorized copy. The resource owner has very limited power in managing access once he or she makes it available because digital content is easy to duplicate and edit. In fact, digital content is made for duplication, which is the only way for it to get transferred or disseminated. It is not a new practice. However, unlike the artisans/churches of the old days, who spent years producing replicas of famous art pieces to meet local demands, anyone nowadays can use computers to generate identical copies quickly and effortlessly.
If the photograph is copied 10,000 times and uploaded to 10,000 websites by 10,000 users under 10,000 names, then at the end of the day how do we answer even the simplest question: what is it? It could get worse. Criminals have been harvesting personal photos from social media and turning them into a lucrative business for a long while. If you’ve ever publicly shared a photo of yourself, chances are it’s in some random dataset. Oh, don’t worry; it’s already too late. From a platform’s perspective, making judgements on copyright infringement or, worse, fraud and scams, is a challenging task. However, the opposite scenario is likely more dire. Given how fast-growing and well-funded the machine learning field is, before long we will have ultra-realistic computer-generated content that easily passes the eye test. Someone can then fake a batch of photos of anyone based on the real photos, for profit or other criminal activities. Sadly, in the current system, the resource owner must entirely trust and depend on platforms to have an honest identity arbitrarily assigned to their objects. The same picture named g3849_90uI.jpeg on BestPhoto dot com is gg6674rZ.webp on WorstPic dot com. Neither name carries any substantial meaning and, essentially, they represent two separate objects that the original owner can no longer own or disown.
A shift from such a platform-dependent construct to a more decentralized distribution, one that supports and thrives on the empowerment of independent resources, directs the development of MOFFAS, though slowly and, to be frank, painfully. MOFFAS, shamelessly named after my big ginger cat at first, stands for Mutuality-Oriented Free-Form Allocation Scheme. In practice, it acts as a resource distribution mechanism that concurrently serves Dawn, the resource side, and LOVN, the consumer side, injecting a dynamic, fluid and intelligent market space in instances: a market only exists when it is needed, with communication between the two independent parties going through GEN. Ling and the assumption of the Identity of Indiscernibles are therefore important not only for Dawn; they also determine whether MOFFAS can operate effectively as a market system, because the ad-hoc one-to-one relationship will cause connected clients to explode when there are too many duplicates, or be nullified when contradictions and conflicts occur.
A Renaissance Eve
So far, I’ve argued for the importance and necessity of recognizing and establishing a unique identity for each Dawn resource object. In this section, I am going to briefly go over the technical aspects of Renaissance EvE, the program that represents and converts this unique identity for application-level programmatic access.
Renaissance is being built to address some of the challenges that arise from a more decentralized Internet, such as data integrity, identity and provenance. One way to explain its purpose is to imagine maximum paranoia, the way philosophers often invoke a demon in thought experiments. If some powerful evil alien force has the ability to interfere with all information systems in the world, how can people at the receiving end of the channel filter and validate the responses they receive? We trust the articles that we fetch on the New York Times website to be genuine because the data is sent by servers maintained by the New York Times. If the invincible alien technology can easily intercept data packets during transmission or taint the source on the server side, how can individual users know whether to accept or reject the results? Renaissance provides its own mechanism: each Dawn Object carries a computed seal that is readable and verifiable on the receiving end. What sets it apart from traditional methods, such as hashes and checksums, is that it is “meaningful” by itself and thus human-friendly. As of now, seals fall under two categories based on the medium of the enclosed data file: EEC and PBE. For text subjects, seals are generated by EEC, named after my favorite American poet, E. E. Cummings. As suggested by the name, these seals are well-known poems that match the target texts after both get processed by a simple algorithm. The use of common “knowledge” and common sense is how EEC prevents manipulation, because these poems have their permanent and unchangeable “Lings” in the world and, for some of us, in our mental space as well. Unless aliens can manufacture global Mandela effects and change how all of us remember Shakespeare’s Sonnet 18, I suppose that we are still safe. Likewise, PBE (short for my favorite painter, Pieter Bruegel the Elder) protects image-based objects with a similar approach.
The seals are some of the most famous paintings, mainly from the Renaissance and Baroque periods. Thanks to the often ignored power of common knowledge (and common sense), the algorithms for both types of Renaissance Seals can afford to be simple and the ledger, as a result, is nimble enough for easy distribution. Meanwhile, on the client side, since the poems and paintings are small flat files, they can be hosted locally; even in the unlikely event that the local files are corrupted, it is easy to query them on the Internet.
While Seals work on data accuracy and integrity (top-down process) to detect delinquencies such as tampering and deepfakes, EvE, as mentioned in the beginning, is built for identification of Dawn resource objects (bottom-up process). At the moment, EvE primarily works with image and video-type resource objects. Text, on the other hand, surprisingly turns out to be both the easiest and the hardest medium to work with.
The mainstream strategies for searching image data tend to revolve around feature extraction and vector embeddings, which also guide the development of image-related machine learning fields, from face detection to object recognition. Powerful as these strategies are, they are hard to deploy in a decentralized environment due to the sizes of the underlying models/networks (in all fairness, vector embeddings do have a fixed-size representation). Moreover, the models are constantly learning and updating, requiring close supervision and maintenance. One significant consequence of these limitations is growing disparity, as only large corporations can afford to have such technologies and take advantage of their power, which, in turn, creates more disparity. It is the reason why I’ve repeatedly raised concerns about exhaustive computational approaches, such as Bitcoin mining and brute-force encryption. Luckily, EvE is not a machine learning program, and our much simpler problem here can ultimately be reduced to array comparison. To comply with the laws of Renaissance, EvE employs a real-life model, like the sonnets for EEC, as the basis for such comparison. What this means is that instead of creating a model with feature engineering for vector similarity search, EvE tries to find a middle layer that already exists and is therefore bound to match A-AB-B. This approach gives us a boolean value as the result, instead of an array sorted on cosine similarity, because EvE is about the question of who you are, not how many people are out there that look like you.
While I always had the natural model in mind, rebuilding it into a workable layer that can digest input data hasn’t been easy. In fact, it took four frustrating months and is still less than satisfactory. The basic setup is intuitive. Depending on how the input data is collected, EvE provides two access points for queries: controlled environments and natural scenes. Controlled environments are where data is extracted by a browser or submitted by an application. For both image and video files, EvE can provide some tolerance for distortion and therefore withstand tests like watermarks, captions, light recoloring and even minor trimming. Note that if an object’s file is substantially edited, then it is a different object in Dawn according to the Indiscernibles rule. On the other front, natural scenes are significantly more difficult to process since data is likely collected by a camera-enabled app and external interferences are unpredictable, with some of the more challenging ones being lighting, hand trembling and screen reflection/refreshing. In order to achieve better results, a different set of algorithms has to be employed for natural scenes. Compared to those for controlled environments, which take a global view, these algorithms are more locality-focused and match feature keys sporadically. After all, how long can you hold a camera steady when you want to look up a video? Inevitably, such methods come at a performance cost, but the only way to definitively improve that, I think, is to work on better hardware devices. It makes no sense that we cannot have the technology to facilitate our five senses, especially the auditory and visual ones, in the near future. On a side note, the obstacles faced in natural scenes are the main reason text-based targets are hard to process. While you can capture an entire image or video frame in one event, it is hard to do so with long articles formatted into fancy layouts.
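The difference between the two kinds of answers can be sketched in a few lines of Python. This is not EvE’s algorithm, which isn’t described here; the function names, the toy four-element arrays and the tolerance value are all assumptions made up for illustration. The sketch only contrasts a similarity search, which always returns a ranked list, with an identity check, which returns a plain boolean.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Similarity search: every candidate comes back, sorted from most to least alike."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def is_same_object(query, reference, tolerance=0):
    """Identity check: True only if the arrays agree element by element within tolerance."""
    if len(query) != len(reference):
        return False
    return all(abs(q - r) <= tolerance for q, r in zip(query, reference))

# Hypothetical toy data standing in for whatever arrays a real encoder would produce.
candidates = {
    "original": [10, 200, 30, 90],
    "lookalike": [12, 190, 45, 70],
    "unrelated": [250, 5, 240, 3],
}
query = [10, 201, 30, 89]  # the original after a slight distortion

ranking = rank_by_similarity(query, candidates)  # "how many look like you"
matches = {name: is_same_object(query, vec, tolerance=2)
           for name, vec in candidates.items()}  # "who you are"
```

The ranking still lists the lookalike with a high score, while the identity check accepts only the original. The small tolerance stands in for the kind of minor distortion a controlled environment might allow.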
However, I am working on the hybrid type of text and graphics, which will be useful for processing and distributing the likes of magazine editorials.
There are a number of types of image-based data that EvE and PBE do not work well with, a result of both my limited skills and the principle of the Identity of Indiscernibles. As the algorithms rely heavily on color contrast, if an image is too generic (a video of polka dots throughout) or lacking in details (a photo of color blocks), it will get rejected. However, these limitations are at the same time necessary and even useful, as they force a quality test on Dawn objects: the data must be substantial.
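A rough idea of what such a quality gate could look like, as a minimal sketch only: the function name, the two thresholds and the use of a grayscale grid are assumptions for illustration, not the checks EvE or PBE actually perform. It rejects input whose intensity values show too little contrast or too few distinct levels.

```python
from statistics import pstdev

def is_substantial(pixels, min_contrast=20.0, min_distinct=8):
    """Return True if a grayscale grid (rows of 0-255 ints) has enough contrast and detail.

    Both thresholds are hypothetical values chosen for this example.
    """
    flat = [value for row in pixels for value in row]
    contrast = pstdev(flat)    # spread of intensities across the whole image
    distinct = len(set(flat))  # crude proxy for the amount of detail
    return contrast >= min_contrast and distinct >= min_distinct

# Three toy 8x8 inputs.
flat_gray = [[128] * 8 for _ in range(8)]              # no contrast at all
two_blocks = [[0] * 4 + [255] * 4 for _ in range(8)]   # high contrast, no detail
detailed = [[(x * 37 + y * 91) % 256 for x in range(8)] for y in range(8)]

results = {
    "flat_gray": is_substantial(flat_gray),
    "two_blocks": is_substantial(two_blocks),
    "detailed": is_substantial(detailed),
}
```

Only the detailed grid passes. A repetitive pattern such as polka dots would need an additional check for self-similarity, which this sketch leaves out.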
As of now, Dawn’s priority is creative content in the image and video categories. However, the Renaissance Seals for videos use the same PBE method that is used on still images. It has plenty of drawbacks, but the most frustrating one is that it can be extremely time-consuming. Ideally, I would find a way to “seal” video files with videos.
In the meantime, cutting back on the dependence on “external storage” is naturally the next step for Dawn in general. For now, after identifying a Dawn object, a user-end client (or our own LOVN) will fetch information attached to the object from a Dawn storage node (“Lighthouse”). However, this process can be inconvenient and potentially insecure, even though the information carries a Renaissance Seal. I’ve started to work on better methods to mathematically store such information within the objects or, strictly speaking, with their Lings directly.
As Dawn begins its loading process, one may wonder how to ever spontaneously find a Dawn object through a top-down process, similar to a traditional search/suggestion engine. That’s the job of GEN, a mechanism that enables smart communication and matchmaking between Dawn and LOVN. While Renaissance focuses on the identity of Dawn objects, or what it is, GEN is all about the “personality,” or what it is like. We will come back to it next time.