It's pretty magical to enter a string of text into a program and be presented with a unique image that (often) resembles the spirit of the prompt. Even though I’m not the “make pictures” type of artist, I’ve been asked several times if I think the future of image making is in AI image-generating technologies like Dall-E 2 or Midjourney. Questions about technology and legitimacy have always played a role in the art world, particularly around the tension between original works and reproductions, which has shifted alongside new methods of production. Further, questions about the utility of technology-aided tools are part of a much longer trajectory, from Luddites who viewed textile machinery as a deceptive evasion of labor, to the Catholic Church’s opposition to the printing press.
When we ask Dall-E to produce a person, place, or thing, where does it get that information? OpenAI uses data from millions of public repositories to train its models, many of which are annotated by extremely underpaid workers. Although some people have felt that AI is sentient, the ideas and images emerge from collective, unverified sets of coded data that decide whether an image is a “person,” or a “black person,” a “flower,” or a “Polyantha rose.” These decisions are based on how that data is manually classified, creating an embedded informational bias underpinning Dall-E’s “imagination.” This is not just related to image production; these datasets underpin decisions about health and medicine, employment, creditworthiness, and even criminal justice.
Although these technologies are both fascinating and terrifying, I think it's important that we dissect and understand them rather than taking them at face value. To me, one of the biggest risks of this technology is to imagine it is objective, neutral, or simply mindlessly performing tasks. As cyborg theorist Donna Haraway reminds us: we should always be wary of the gaze that appears to be from nowhere.
The most clicked link from last week's issue (~7% of opens) was the 3.6-meter-tall octobass that can play very, very low notes.
This week, the Members' reading group will be discussing Stuff Matters with author Mark Miodownik. The book is an accessible celebration of material science and has sparked interesting conversations on the fascinating (and a little gruesome) world of joint replacement.
Planning & Strategy.
- Dall-E creates a link between prompts and visual representations using contrastive language-image pre-training (CLIP). To train a CLIP model, images and their associated captions are passed through separate image and text encoders. Then, the similarity of every image/caption combination is computed, with the aim of maximizing the similarity of correctly matched image/caption pairs and minimizing the similarity of mismatched ones. While we’ve quickly become accustomed to seeing images output from these tools, CLIP Interrogator reveals the inverse. The tool uses CLIP to generate a good prompt from an image input, and some of its results are pretty absurd!
- Amazon’s micro-work service Mechanical Turk uses humans to perform small tasks that computers can’t, such as annotating images and videos for AI data sets. The original Mechanical Turk was an automaton created by Wolfgang von Kempelen in 1770, which appeared to be a mechanized chess-playing robot. It was later revealed to be a hoax: a person hidden inside the machine made the moves, using a magnet under the board and pulling levers to maneuver the automaton. The Turk toured Europe and was exhibited until the death of Kempelen, after which it was sold to Bavarian musician Johann Nepomuk Mälzel. In 1809, Mälzel’s reincarnation of the Turk, secretly operated by chess master Johann Baptist Allgaier, famously beat Napoleon in a match. Napoleon was allegedly curious and in good humor about it, attempting several illegal moves to test the limitations of the Turk. Kempelen is also known for his extremely unnerving Speaking Machine.
- The Connections Museum’s YouTube channel has great videos on the history of telecommunications and electromechanical telephone mechanisms. My favorite series is about producing telephone progress tones, like ringing, busy, or dial tones. The first was Thomas Watson’s “Thumper” in 1877, which was essentially just a spring-loaded hammer mechanism that would hit the receiver, notifying the caller there was someone on the line. In the early 1900s, these were replaced by magneto devices, which generated an alternating current to drive more advanced ringers, and had to be manually cranked both before and after calls were made.
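For the curious, the CLIP training objective described above can be roughly sketched in a few lines of Python with NumPy. This is not OpenAI's code. The function names, the 0.07 temperature, and the random stand-in embeddings are all assumptions made for illustration, and the image and text encoders are skipped entirely (the sketch starts from vectors they would have produced).

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Row i of image_emb is assumed to belong with row i of text_emb.
    image_emb = l2_normalize(image_emb)
    text_emb = l2_normalize(text_emb)
    # Similarity of every image with every caption (an n x n grid).
    logits = image_emb @ text_emb.T / temperature
    diag = np.arange(logits.shape[0])
    # Each image should pick out its own caption, and each caption its own image.
    loss_image_to_text = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_text_to_image = -log_softmax(logits, axis=0)[diag, diag].mean()
    return (loss_image_to_text + loss_text_to_image) / 2

# Stand-in embeddings (assumption: a real model would get these from its encoders).
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
matched = text + 0.01 * rng.normal(size=(4, 8))  # images that agree with their captions
shuffled = matched[[1, 2, 3, 0]]                 # the same images paired with the wrong captions

print("matched pairs: ", contrastive_loss(matched, text))
print("shuffled pairs:", contrastive_loss(shuffled, text))
```

The loss is small when each image sits closest to its own caption and large when the pairs are scrambled. Training nudges the encoders toward the first case.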
Making & Manufacturing.
- These string-activated folding surfaces create some really complex forms using computational origami. I love the nautilus shell example! One of my favorite tools to create interesting folds is Tomohiro Tachi’s Origamizer, which generates folding patterns that can span any shape. Here is a list of other computational origami-related tools.
- The Mechanical Turk's success can be partly attributed to the wild popularity of automata in the 1700s. One of my favorites is Jacques de Vaucanson’s Canard Digérateur, or Digesting Duck. The device was about the size of a real duck and made of gold-plated copper. It would quack and splash water with its bill, then take food from its operator's hand, swallow it, and excrete what appeared to be a digested version of it. Vaucanson called it a chemical laboratory, but in reality, it was also a hoax. Of somewhat less importance, Vaucanson also invented the first all-metal lathe in 1760. Thankfully, in 2002 Wim Delvoye took up the idea of a digestion machine again, building the Cloaca Machine, which was much bigger than a duck.
- How a player piano works.
Maintenance, Repair & Operations.
The history of medical devices contains some pretty horrifying yet impressive techniques. While the intent behind ancient cranial drilling isn't entirely understood, the earliest evidence of the procedure dates back to 4000 BC. The procedures were also largely survivable, with 75–83% of Incan patients showing evidence of a healed skull. This survival rate exceeds outcomes for head trauma patients during the American Civil War. Modern cranial perforator drill bits are single-use and avoid drilling into the brain with a clever mechanism that disengages the spinning bit as soon as it passes through the skull.
Distribution & Logistics.
In 1610, Galileo Galilei wrote the Sidereus Nuncius (the Starry Messenger, translated to English in 1880), a treatise that contained his early observations about stars and planets. This text was very controversial because it contradicted the beliefs of the time, popularized by Aristotle and the Catholic Church. Up until this point, the Moon and other celestial objects were believed to be perfectly round and made of quintessence. First proposed by Aristotle as a fifth element, quintessence was theorized to fill the region of the cosmos outside Earth. The concept was used to explain several phenomena, such as gravity and the travel of light. This was the first text to put Galileo in major conflict with the Catholic Church; he was later ordered to turn himself in to the Holy Office for suggesting the Earth revolved around the sun.
Inspection, Testing & Analysis.
- Facial tracking surveillance equipment is common in public spaces, inspiring artists to experiment with face detection blocking techniques. Jewelry, masks, or projections can obscure the wearer and trick the software. This “invisibility cloak” takes a different approach: rather than blocking recognizable parts of the wearer, it is essentially a shirt covered in patterns designed to look like nothing remotely identifiable to machine vision algorithms.
- Until about the 1980s, astronomers classified and cataloged stellar spectra by photographing the cosmos using a huge curved mirror on a telescope, which collected the nebular light, and slowly exposed emulsion on a photographic glass plate. The Harvard College Observatory has a collection of more than half a million glass plates from the last 100 years, many of which are still being mapped to images taken by Hubble and Webb.
- Polarizing films are often used in microscopy to analyze birefringent materials, whose refractive index depends on the polarization of the light passing through them. By adding a second polarizing filter, beautiful interference patterns can be seen in crystals, insects, and isopods. This interactive artwork uses the same technique to create pixel art, rotating polarizing films to change their opacity.
- This beautiful helix calendar.
- Jeremy Fielding explains how your perpetual motion machine will never work.
- All this talk of AI reminds me of Microsoft’s 2016 Twitter AI experiment, Tay, which began to make disturbing misogynistic and racist remarks before reaching its 24-hour birthday.
p.s. - I’m trying to get Midjourney to make a three-eyed dog. If you can do this or want to talk about why you can’t, please email me.
p.p.s. - We care about inclusivity. Here’s what we’re doing about it.