Dropbox has quietly launched a new password manager named Dropbox Passwords. The app is only available in a private beta on Android, and although you can download it, you won’t be able to use it unless you’ve got an invite. The app’s Play Store listing notes that the app is currently “in development” and therefore may be unstable.
The app seems pretty basic in its current state. Like most password managers, it can create unique passwords, store them in one place, sync them across devices, and automatically fill in login fields. There’s no mention of other useful features like importing passwords from browsers or support for two-factor authentication.
It also advertises something called “zero-knowledge encryption,” which means only the...
Let’s talk about Australia’s energy policy failure.
At a time when most of the developed world has woken up to the facts that a) fossil fuels are really bad for people and the environment and b) fossil fuels are more expensive than non-polluting alternatives, the Australian federal government continues to shovel enormous subsidies at our failing fossil fuel industry.
Yes, that’s right. Despite having enormous reserves of oil, coal, and gas, a modern(ish) banking sector, and the best solar resource on Earth, Australia continues to operate some of the oldest, most polluting power stations on Earth while enduring some of the highest energy prices in the world. Even small Pacific island nations with no local fuel supply have lower energy prices!
Of course, confronted with this continuing cascade of failure, the official response continues to be “more of the same”.
That solar and batteries, installed today, could meet >100% of Australia’s electricity demand at about a tenth of current retail prices is beyond doubt. Solar energy reached price parity in Australia in about 2011 and has gotten more than twice as cheap since, while the costs of operating Australia’s ancient, poorly managed grid have only grown.
So why hasn’t the policy evolved?
Could it be that Australians are backward, ill-informed people, propagandized by a monopolistic and opportunistic media industry? To some extent. After all, Australia is the only country I know where the entirely fictitious “wind turbine syndrome” continues to get media airtime.
Could it be that our leaders have never been particularly visionary and usually take their cues from industry-financed lobbyists? Sure, some of them, probably. Indeed, some Australian MPs are spruiking this article, a veritable cornucopia of discredited energy ideas.
Don’t believe me? It summarizes the Australian chief scientist Dr Alan Finkel’s call to use coal and gas, combined with carbon sequestration, to produce hydrogen to feed the “hydrogen economy”.
Where does one even start?
Carbon capture and sequestration (CCS) is the idea that maybe we can take CO2 from smoke stacks, compress it and store it somewhere, perhaps underground. This has basically never been practical, as the costs of storing the CO2 will always be higher than the value generated by producing it. This inconvenient detail is often concealed as an excuse to go on burning stuff. Indeed, every CCS project I know of has been a dismal failure, with the most glaring example occurring at the Kemper plant in Mississippi, in 2017. After consuming $7.5b and running years behind schedule, the project was scrapped entirely, with the DOJ later opening an investigation. Now, almost 4 years later, the Australian government’s consultants think they’ve fixed what the hopes and cash of the entire dying US coal industry couldn’t? If you believe that you’ll believe anything.
And that’s not even getting to the “hydrogen economy,” the outdated 1980s idea that in the future we’ll use hydrogen instead of gasoline to power our vehicles. While hydrogen doesn’t produce CO2 during combustion, it is difficult to make, store, and move around, as it’s a low-density, hard cryogen with the antisocial habit of leaking through metal containment vessels. Fifteen years ago Tesla decided to produce electric cars using laptop batteries, and by 2008 their performance had already exceeded the theoretical upper limits of hydrogen fuel cell technology, limits that actual fuel cells have never reached even in a laboratory. No hydrogen-powered car has ever been mass-produced. Nor will they.
In summary, the “technology stack” currently advocated by the Australian government is a fail-centipede of bullshit. It can’t work, it will never work, and it’s not really even meant to work.
Of course, just as coal mining companies continue to talk up CCS, traditional automakers continue to “invest” in fuel cell technology. Both are yardsticks that haven’t moved since I was a child. And both industries are now being utterly crushed by predictable applications of competing technology that could have been foreseen by a disoriented penguin.
Why do they do this? Imagine being a business development executive at General Motors. (I would use an Australian automaker but, oh snap, our elected representatives spent decades watching while our manufacturing sector collapsed. By the end of this blog, you’ll understand how to fix this.) You have to pitch the board on your plan to spend billions developing electric car technologies that will only cannibalize existing sales, drastically depreciate existing tooling, supersede engine manufacturing, and require massive capital outlays. It’s never going to happen. Why sacrifice the next few profitable quarters doing business as usual on a gamble to reinvent the industry, especially when innovation is usually led by suppliers?
So, same old shit, day in, day out.
Traditional automakers’ business models are predicated on expensive service for the existing fleet out of warranty, and are not compatible with developing compelling electric vehicles.
This is how a tiny upstart Californian company was able to eat their lunch to such a degree that it’s gone from “Tesla will die any day now” to “Tesla is a decade ahead of the competition and it’s all over bar the shouting” in about six months.
Would it surprise you to learn that Australian energy policy is basically the same? Let me explain.
Energy is really important. I was taught from birth by my bleeding heart liberal tribe to hate petrol and burning coal, but the reality is that petrol makes cars and trucks and tractors go. Yes, it’s a terrible poisonous dirty corrupt industry, but without it (or any part of it) Australians run out of food and starve in a matter of weeks. Keeping the lights on and the vehicles moving is at the core of Australian security and it’s a serious matter, especially when we depend on foreign imports for literally everything. This unspoken reality, I believe, drives a lot of otherwise peculiar obsession with the similarly-motivated US policy of bringing “peace and democracy” to the Middle East, at least before fracking was commercialized at scale.
That said, mining and exports are basically the only productive part of the Australian economy that’s left. To say we depend on them is an understatement. Every year that the government can avoid making tough decisions on the future of Australian industry is another year we can live off selling our coal to China, while being slowly baked alive by climate change.
I’m going to oversimplify here but there are a few different ways that economies can build wealth. Let’s talk about primary, secondary, and tertiary parts of the economy. Primary production is mining, extraction, farming. Getting bulk raw materials. Secondary production is manufacturing, where raw materials are combined to produce commercial products. And tertiary services are human-to-human tasks where a lot of people are now employed, be it in health, education, tourism, or sales.
The Australian economy is strong in primary production. We have fabulous wealth in basically everything: mining, agriculture, timber, etc. Without this we would probably have an economic situation more like Mongolia. Primary production is great, as far as it goes. But it leaves a lot of wealth on the table.
Australia has a strong services sector, but like any business where wealth is created through one-on-one human interactions, it’s not fit to be the engine of wealth creation. The economy can’t operate without it, but it still needs a strong underlying system to create real things.
Let’s talk about wealth. Wealth is created when humans or machines perform a process that improves the value of something. Normally wealth creation is equated with superannuation or investment, but the other end of that transaction is (hopefully) someone actually doing something useful, which you (the investor) get to profit from, because capitalism.
My favourite example here is an iPhone. An iPhone is about half aluminium by weight, and let’s say it began its life as bauxite (aluminium ore) in northern Australia. As bauxite, it was put on a ship and sold for $0.40, at a single-digit percentage margin. Once smelted into aluminium somewhere in Asia, it was worth about $2.50. Once machined into the right shape, its Bill of Materials cost is around $15. The phone costs about $370 to produce and is sold for $999, so the consumer’s cost for that aluminium chassis is $40. The process of mining, smelting, machining, and sales increases the value of the raw materials by a factor of more than 100. Of that $40, $25 went to Apple to cover R&D, shipping, sales, and other overhead. $12 went to the machinist, most of which is needed to cover the cost of the tooling. $2.10 went to the smelter, most of which covers the cost of electricity and carbon electrodes. Finally, the (probably foreign) mining company’s revenue was 40c, its profit was about 2c, and the tax the government earned was less than a cent.
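The value-chain multiples in that example can be checked with a few lines of arithmetic (all figures are the rough, illustrative estimates above, not audited data):

```python
# Value added to the aluminium in one iPhone at each stage,
# using the illustrative estimates from the text (not audited figures).
stages = [
    ("bauxite at the mine gate",          0.40),
    ("smelted aluminium",                 2.50),
    ("machined chassis (BOM cost)",      15.00),
    ("consumer's share of retail price", 40.00),
]

base = stages[0][1]
for name, value in stages:
    print(f"{name:34s} ${value:6.2f}  ({value / base:6.1f}x bauxite value)")

# How the consumer's $40 splits up, per the text:
split = {
    "Apple (R&D, shipping, sales, overhead)":   25.00,
    "machinist (mostly tooling)":               12.00,
    "smelter (electricity, carbon electrodes)":  2.10,
    "mining company revenue":                    0.40,
}
print(f"accounted for: ${sum(split.values()):.2f} of $40.00")
```

The final stage works out to a 100x multiple on the mine-gate price, which is the factor the text quotes.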
Every year, Australia exports $4.5b of bauxite, which is a decent sum of money for a few big holes in Queensland. After smelting, that aluminium is worth $25b. After its conversion into parts for planes, cars, laptops, and soft drink cans it’s sold for more than $100b. By failing to exploit Australia’s ludicrously cheap solar energy to onshore more material processing, we are giving away nearly all that value for free.
I cannot be more emphatic about this. The failure of Australian domestic manufacturing will destroy our way of life forever.
But, I hear you cry, Australia weathered the global financial crisis just fine. Boomers who own houses are doing fine. If the economy is screwed, why is real estate still out of control?
It’s a bubble. Housing prices are determined by supply and demand. But unlike other commodities, housing demand is extremely inelastic, since everyone needs to live somewhere, and supply is also relatively fixed. As a result, housing prices are a direct function of credit availability. The one and only reason for rising house prices is rising availability of enormous mortgages, enabled by continual deregulation of the rent-seeking banking industry and a government happy to trap the next generation in a lifetime of loan servitude in exchange for a small piece of the action.
Okay, so it’s morally questionable and hugely inconvenient, but how does that make it a bubble? The intrinsic value of the underlying asset is fixed. Houses that get ten times more expensive cannot house ten times as many people. They are infrastructure that actually depreciates as it wears out. They don’t produce anything. For my entire life, the bulk of Australian consumers’ discretionary spending has been poured into servicing loans, which has starved the rest of the economy for capital, with predictable results.
It’s not real wealth growth. We can’t sell all those houses and get the money back. We can’t even tax this empty price growth, or the impact on demand will crush the market overnight, destroying years of “invested” GDP. Its deliberately undersized impact on the consumer price index obscures its true effect: inflating away the value of every other part of the economy. This money is gone. The opportunity it could have bought, with steady investment in secondary industries that actually make something, is gone.
Try buying any kind of specialty equipment in Australia. Try to get a sales rep or engineer on the phone who actually knows what they’re talking about. Try walking into a retail electronics store and buying a transistor. There are market booths in Shenzhen with a greater selection than every electronics shop in Sydney combined. This is a big problem. A country that has forgotten how to make anything is strategically vulnerable.
We’ve talked about how materials traverse the value chain, mostly outside of Australia. We’ve talked about how misplaced “investments” in real estate have burned up 30 years of Australian economic output with nothing to show for it. But we haven’t yet covered the single most important source of wealth for any nation on Earth. People.
Every day, people generate wealth. Every single part of the economy depends on people to operate it. To do the work. And while the West, with its 40-hour working week, has had a continuing labor surplus since the end of WWII, a vibrant domestic manufacturing industry can’t be imported. It needs to be built up, in Australia, by Australians. This stuff isn’t taught in school, and probably never was. Building a competitive industrial sector is learned through experimentation in garages and backyards by motivated people who cannot succeed without a critical supply of existing industry knowledge and a fungible supply chain. I don’t believe manufacturing consumer electronics is the answer to Australia’s industry woes, but kids can’t build robots without them.
Knowing that a technologically literate and constructive workforce is essential to Australia’s future, government policy has consistently promoted innovation in education, skilled migration, entrepreneurship, and critical infrastructure, right? Right? Hahahahahahahaha. In just my lifetime, I’ve seen the growth of the complete antithesis of these values, and for no apparent reason.
Got an idea in the US? People will line up to buy you lunch and offer to invest, even expecting that you’ll fail the first three times. Got an idea in Australia? “Nah mate, that’ll never work.” Our universities have been subordinated to the tourism industry, with too many places sold to full-fee-paying foreign students spending insane amounts of time and money on a piece of paper already in the throes of self-inflicted devaluation. Ever tried to get an Australian entrepreneurship visa? It doesn’t exist. And finally, let’s not forget the national broadband network. After blowing incredible sums of money on politically expedient (and wrong) technology, Australia is 50th in the world in internet speed. Many online scammers actually skip Australia specifically because its terrible internet infrastructure costs them too much to be worthwhile. This isn’t a good thing. It’s an international disgrace, and the people responsible should be ashamed.
Let’s sum up. The Australian economy is in deep shit. Primary production ships >99% of the value overseas and, like any commodity business, has terrible margins at the best of times. The bloated real estate sector has sucked the life out of domestic manufacturing and the rest of the economy. Infrastructure sucks. The education system has left a generation literally propagandized into thinking that technology is something that comes from overseas, if and when a foreign sales team is desperate enough to try to corner the tiny Australian market. And on top of this, the best and brightest national leadership continues to shovel what little disposable development cash it has into propping up the fossil fuel export industry, window dressing it as some antique discredited technology to sucker the rubes into agreeing.
It sounds pretty bleak. A few years ago, when I decided to stay in the US, I was convinced that it was only a matter of time until Australia, too, had its moment of reckoning with the inevitable collapse of the housing market taking everything down with it. Argentina 2.0.
Obviously these are huge problems, and they can’t simply be wished away. However, through no fault of our own, salvation may be at hand. Or at least an alternative to continuing to dig a deeper hole.
It’s time to begin the process of bringing advanced manufacturing back onshore in Australia. How? After all, China makes stuff more cheaply than we could ever hope to. This is true and it’s unlikely to change – economies of scale mean that even with the cost of shipping, Chinese factories will make goods more cheaply than Australian factories can.
The answer lies in energy-intensive industries. It may seem odd given Australia’s habit of selling its gas overseas and then acting surprised that energy prices increase, but the right government policies could actually reverse this trend, giving Australian consumers and industry the cheapest electricity on Earth, forever.
How can this be done, and what can it be used for?
In adoption of wealth-building technology, Australia lags most of the world by decades. How is it possible to catch up? Consider the above map. Australia is basically the only (business friendly) western liberal democracy that has huge (developed) mineral resources AND enormous (undeveloped) solar resources. Nearly every competing country has to mine stuff in one place and ship it somewhere else, where energy is cheaper, to be processed.
With commercially available solar and battery storage technology, Australia could easily deploy a gigawatt of solar production annually, at a combined cycle price of about 2.5c/kWh. That’s 10 times cheaper than current retail prices and less than half the price of electricity in Iceland, which exploits its cheap geothermal energy to smelt enormous quantities of aluminium. Why ship bauxite to Iceland when it can be smelted right by the mine and sold for 6 times the price? Electricity is the main cost of aluminium production. Secure an infinite supply of much cheaper electricity and the rest is pure margin. Usually when we think of a country charging 100 times production cost for mineral wealth we think of Saudi Arabia or Kuwait. In our solar future, Australia can have oil-level wealth while saving the planet.
Solar energy continues to get cheaper by about 10% a year. No matter how good a competing business case may be, it’s only a matter of time before the tech that powers trees crushes it completely. Australia’s current electricity generation capacity is 66.5 GW. At current prices, 100 GW of solar power would cost about $15b to deploy, which is almost nothing compared to current fossil subsidies when spread over a decade. This power would be too cheap to meter 99% of the time, driving material processing onshore.
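Taking the article’s own figures as inputs, the implied capital cost and the effect of a sustained 10%/year price decline are quick to check (a back-of-envelope sketch, not an engineering estimate):

```python
# Back-of-envelope check of the solar figures quoted above.
# Inputs are the article's estimates, not authoritative data.
capacity_w = 100e9          # proposed build-out: 100 GW, in watts
total_cost = 15e9           # quoted deployment cost: $15b

cost_per_watt = total_cost / capacity_w
print(f"implied capital cost: ${cost_per_watt:.2f}/W")

# If solar keeps getting ~10% cheaper per year, after a decade it costs:
factor_10yr = (1 - 0.10) ** 10
print(f"after 10 years: {factor_10yr:.0%} of today's price")
```

The implied $0.15/W is a module-level rather than fully-installed price, which is worth keeping in mind when comparing against utility project costs.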
What can you do with nearly free electricity? We’ve discussed smelting aluminium, but there are many other energy-intensive processes that should be done in Australia before anywhere else on Earth. Mass desalination for irrigation. Comprehensive recycling of basically anything. Hard-rock, zero-impact mining using tunnel boring machines. Anything requiring refrigeration.
Just because Australia can’t export solar power the way it exports coal doesn’t mean that the wealth embodied by Australia’s enviable solar resource can’t be monetized. Smelted aluminium contains the huge quantities of energy used to create it, and the same goes for anything else where any amount of onshore processing exponentially increases the domestic value creation.
Australia does not deserve to escape the consequences of its consistently small-minded and backwards economic policies. We are incredibly fortunate that foreign investment in solar technology has created an opportunity ripe for the picking. We would be wise to exploit our natural resources to electrically re-industrialize our secondary manufacturing sector before competing nations get around to it. With solar costs falling 10% a year, Australia’s failure to move could cause competing nations to get to low cost onshore mineral processing first, at which point Australia will have blown yet another golden opportunity.
This window, this undeserved path to salvation, will not remain open forever.
I hate paywalls on articles. Absolutely hate them.
A standard pro-business argument: businesses can either make your life better (by providing deals you like) or keep your life the same (by providing deals you don’t like, which you don’t take). They can’t really make your life worse. There are some exceptions, like if they outcompete and destroy another business you liked better, or if they have some kind of externalities, or if they lobby the government to do something bad. But in general, if you’re angry at a business, you need to explain how one of these unusual conditions applies. Otherwise they’re just “helping you less than you wish they did”, not hurting you.
And so the standard justification for paywalls. Journalists are providing you a deal: you may read their articles in exchange for money. You are not entitled to their product without paying them money. They need to earn a living just like everyone else. So you can either accept their deal – pay money for the articles – or refuse their deal – and so be left no worse off than if they didn’t exist.
But I notice feeling like this isn’t true. I think I would be happier in a world where major newspapers ceased to exist, compared to the world where they exist but their articles are paywalled. Take a second and check if you feel the same way. If so, what could be going on?
First, paywalled newspapers sometimes use a clickbait model, where they start by making you curious what’s in the article, then charge you to find out.
Here are some articles I’ve seen advertised recently (not all on paywalled sources): “Why Trump’s Fight With Obama Might Backfire”, “This Tech Guru Has Made A Shocking Prediction For 2020”, “Here’s Why Men are Pointing Loaded Guns At Their Dicks”.
I didn’t wake up this morning thinking “I wonder whether men are pointing loaded guns at their dicks, and, if so, why. I hope some enterprising journalist has investigated this question, and I will be happy to compensate her with money for satisfying this weird curiosity of mine.” No, instead, I was perfectly and innocently happy not knowing anything about this, right up until I read the name of that article at which point I became consumed with curiosity, ie a feeling that I will be unhappy until I know the answer. In this particular case it’s fine, because the offending website (VICE) is unpaywalled. I go there and after reading through nine paragraphs attacking “MAGA dolts”, in the tenth paragraph I get the one-sentence answer: there’s a meme in the gun community that any time someone posts a picture with their gun, amateurs will chime in with condescending advice about how they should be holding it more safely, so some people post pictures of them pointing loaded guns at their dicks in order to piss these people off. I feel completely unenlightened by knowing this. It has not brightened my day. It just removed the temporary itch of curiosity.
Some people critique capitalism by saying it creates new preferences that people have to spend money to satisfy. I haven’t noticed this being true in general – I only buy shoes when I need shoes, and I only buy Coke when I want Coke. But it seems absolutely on the mark regarding paywalled journalism. VICE created a new preference for me (the preference to know why some people point loaded guns at their dicks), then satisfied it. Overall I have neither gained nor lost utility. This seems different from providing me with a service.
They have an excuse, which is that this is how they make money. But what’s Marginal Revolution’s excuse? I saw this link in an MR links roundup. It was posted as “5. Why men are pointing loaded guns at their dicks.” So obviously I clicked on it, and here we are. But what is MR’s interest in making me click on a VICE article and read through nine paragraphs about “MAGA dolts”?
I can’t really blame them, because I did the same thing for years. I posted links posts, I framed the links in deliberately provocative ways, and then I felt good about myself when my stats page recorded that thousands of people had clicked on them. Sometimes I would write the whole thing out – “Here’s an article about men pointing loaded guns at their dicks – it’s because they want to criticize what they perceive as an excessive and condescending emphasis on trigger safety in gun culture” – and then nobody would click on it, and I would interpret that as a sign that I had failed in some way. I was an idiot, I apologize to all of you, and I have stopped doing that. I urge other bloggers to do the same – we gain no extra money, nor power, nor readership by being running-dogs for VICE’s weird ploy to trick people into reading its stupid articles. But as long as bloggers, Facebookers, tweeters, etc aren’t following good Internet hygiene, the very existence of paywalled sources will continue to be a net negative for the average Internet user.
This isn’t just about obvious clickbait like men pointing guns at their dicks. “Why Trump’s Fight With Obama Might Backfire” feels exactly the same to me. I don’t want to know more ephemeral garbage about Trump which may or may not affect his polls 0.5% for a week before they return to baseline. I don’t want to get more and more outraged until my ability to relate to my fellow human beings is shaped entirely by whether they’re a “MAGA dolt” or not. And yet I find myself curious what’s in the article!
(Trump’s fight with Obama might backfire because independents like Obama more than Trump, and the tech guru’s 2020 prediction was that Trump will lose. You’re welcome.)
Second, paywalled articles become part of the discourse.
Last week’s Wall Street Journal included an opinion column, Lockdowns Vs. The Vulnerable, arguing that statistics show the coronavirus lockdowns do not really prevent the coronavirus, but do disproportionately affect the most vulnerable people. It’s already gotten retweeted a few dozen times, including by some bluechecks with tens of thousands of followers.
Do you want to figure out exactly what statistics it uses and check whether they really show that lockdowns don’t prevent coronavirus? Too bad – the article is paywalled and you cannot read it without paying $19.50/month to the Wall Street Journal. I personally suspect that this article is terribly wrong, possibly to the point of idiocy. But I can neither convince others of this, nor correct my own potentially-false first impression, without paying the Wall Street Journal $19.50 a month. Which I don’t want to do. Partly because it is bad value, and partly because I don’t want to reward them for publishing false things.
Newspapers publish articles – factual and opinionated – intending them to enter the public square as a topic of discussion. But if the discussions in the public square have an entry fee, the public square becomes smaller and less diverse.
It also becomes more of an echo chamber. Probably conservatives subscribe to the Wall Street Journal and liberals subscribe to the New York Times. So if conservatives post articles from the Wall Street Journal, liberals can neither benefit from the true ones and change their own opinions, nor correct the false ones and change conservatives’ opinions. If you can’t even read the other side’s arguments, how can you be convinced by them?
Third, newspapers make it hard to guess whether you will encounter a paywall or not. Some of them raise a paywall on some kinds of articles but not others. Some of them raise a paywall if you’re linked in from social media, but not if you’re linked in from Google (or vice versa). Some of them raise a paywall if it’s your Xth article per month on a certain computer, but not before.
The end result is you can’t just learn to avoid the newspapers with paywalls. If you clearly knew which links were paywalled, you would just never click on those links, and not waste any time. Since any given newspaper has a 25–50% chance of being paywalled whenever you read it, you get the variable reinforcement schedule that promotes frustrated addiction. And since at any given moment you are desperate to click on that link and find out Why Some Men Are Pointing Loaded Guns At Their Own Dicks, you will, like a chump, click it anyway, only to howl with rage when the paywall comes up.
This usually isn’t a deliberate misdeed; newspapers understandably want to give people limited access so they can decide whether or not they want to subscribe. But some forms of this do seem deliberate to me. Like when they let you read the first two paragraphs and get emotionally invested in the story, and then surprise you with a paywall in the third (I think this is why you need nine paragraphs of filler before getting to the one-sentence curiosity-satisfier). Or when they wait five seconds before popping up a paywall message, for the same reason.
Fourth, and most important, paywalled newspapers make it hard to search for information on Google. When I was trying to gather statistics on coronavirus to figure out how fast it was spreading, I noticed that the top ten or twenty relevant search results for a lot of coronavirus-related queries were paywalled articles. Because articles will make you wait several paragraphs/seconds before the paywall comes up, I couldn’t just quickly click on something, see if it had a paywall or not, and then move on to the next one. Instead, a search that would have taken me seconds if all paywalled sources ceased to exist ended up taking me several frustrating minutes.
There are some simple steps we can take to fix this.
First, search engines should give users an option to hide paywalled articles from results. I realize how big a shitstorm this will cause, and I plan to enjoy every second of it. If they can’t make this happen for some reason, they should at least display a big red $$$ sign in front of paywalled articles, so users know which links will give them information before they waste a click on them. If Google refuses to do this, Bing should do it to get a leg up on Google. If both of them refuse, DuckDuckGo. If all three of them refuse, sounds like they’re providing an opening for some lucky entrepreneur.
Second, browser or browser-extension designers should figure out some way to automatically get links to display whether they’re paywalled or not. Maybe something like this already exists, but I can’t find it.
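One plausible mechanism for such an extension, offered purely as an assumption (I don’t know of a shipping implementation): many publishers embed schema.org metadata with an `isAccessibleForFree` property so search engines can still index paywalled text, and a script could scan a linked page for that marker. A minimal Python sketch of the detection heuristic:

```python
import json
import re

def looks_paywalled(html: str) -> bool:
    """Heuristic paywall check: scan a page's JSON-LD metadata blocks for
    schema.org's isAccessibleForFree property set to false. Many (but not
    all) paywalled publishers emit this marker for search engines."""
    pattern = r'<script[^>]*application/ld\+json[^>]*>(.*?)</script>'
    for match in re.finditer(pattern, html, re.DOTALL | re.IGNORECASE):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # malformed metadata block; skip it
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            # JSON false parses to Python False; some sites use the string "false"
            if str(item.get("isAccessibleForFree", "")).lower() == "false":
                return True
    return False

page = ('<script type="application/ld+json">'
        '{"@type": "NewsArticle", "isAccessibleForFree": false}</script>')
print(looks_paywalled(page))  # True
```

Sites that skip the markup would be missed, so a real extension would presumably combine this with a crowd-sourced list of paywalled domains.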
Third, bloggers (and social media users) should stop deliberately frustrating their readers. Stop posting tantalizing links like “Why Men Are Pointing Loaded Guns At Their Dicks” without further explanation! If you find the dick-gun phenomenon interesting, post the link plus a one-sentence summary. If someone wants more than the one-sentence summary, they can click the link, but I’ve done A/B testing on this and it never happens.
Fourth, bloggers (and social media users) should preferentially link non-paywalled sites. I realize this is not always possible, but most major stories are important enough that at least one non-paywalled outlet will be covering them.
Fifth, until the browser extension comes through, bloggers (and social media users) who do need to link a paywalled site should let readers know it’s paywalled. For example, Lockdowns Vs. The Vulnerable [PAYWALLED] or [$$$] Lockdowns Vs. The Vulnerable. This will save readers a click and hopefully make bloggers think about what they’re doing and whether it’s really necessary.
I’m making a commitment to do 3, 4, and 5 from now on. If I ever change this commitment, I’ll let you know. If you notice me slipping up, please point it out (nicely) and I’ll try to correct myself.
It’s June 5 in China as I type this late in the evening of June 4 in the U.S.
~ ~ ~
We’ve seen U.S. military personnel deployed to American cities which are not burning down and are not under siege; they’ve been deployed because Americans dared to exercise their First Amendment rights.
These are the same innate rights which founded this nation when colonists rebelled against the tyranny and oppression of an autocratic monarch, writing rebellious missives and tossing tea into Boston Harbor.
Troops and equipment were deployed on both coasts, to the Washington D.C. and Los Angeles areas.
Some of this military deployment was just plain stupid, sloppy, wasteful — flip-flopping resources from one place to another. I can’t imagine the military doing this; this is on Barr and Trump.
A federal riot team was dispatched to Miami for some reason. Perhaps it was because of the Trump National Doral Miami golf course, or because Mar-a-Lago, Trump National Golf Club Jupiter, and Trump International Golf Club West Palm Beach are located an hour north. Perhaps it was because Miami-Dade County is only 15% non-Hispanic white and there would surely be protesting there. Maybe it was intended as an intimidation or voter suppression tactic which doesn’t appear to have occurred to Floridians.
With the news, a question hung in the air. Why Miami?
The answer is still shrouded in mystery, but the way the announcement was carried out has confused officials across different levels of government. Several law enforcement sources at both local and federal levels only learned about the team’s presence in Miami after reporters pointed them to statements from the Trump Administration.
Ultimately, the federal team is leaving Miami without being deployed.
Florida’s Gov. Ron DeSantis asked the National Guard to drop its work on COVID-19 support and take up patrol in Tampa because of protests there — but the protests have been relatively peaceful.
There’s also the hyper-militarized police, which can barely be distinguished from the military. This one is particularly puzzling since Walnut Creek, California is a relatively wealthy and relatively white part of the state.
“If you do not move, you will be dead!”
– Police in a military vehicle in Walnut Creek, California.
Some of my friends of Chinese heritage are disturbed by the comparison, suggesting Americans avoid it in no small part because many Chinese are still traumatized by the 1989 events. Others are concerned because China’s government still aggressively censors any mention of the 1989 protests, potentially removing users from social media. This is a serious punishment because identity, employment information, bill paying, and credit scores are all mediated through social media.
Other Chinese who don’t live in the mainland point to the comparison between 1989 and the US in 2020 and warn us not to end up like the Chinese — under an even more repressive state after hundreds of civilians’ deaths when the military put down the protests, squelching demands for a more democratic society.
It doesn’t seem possible that there could be more than a passing similarity between China in 1989 and the U.S. today, given the freedoms many (straight white) Americans in this country possess.
We were reminded, though, that the decision to call in the military may have found its inspiration in 1989.
Law enforcement deployed in/to DC we’ve seen/reported:
US Secret Service
US Park Police
Arlington PD (gone)
DC Nat’l Guard + other states
Bureau of Prisons
Pentagon Force Protection
Ft Bragg/Fort Drum active duty troops
So much energy and resources wasted because Trump has a ridiculously shallow concept of power and how best to use it.
But even more ridiculous than all this overkill intended to suppress Americans’ First Amendment right to exercise free speech through protest is the Republican Party’s hypocrisy, from Sen. Tom Cotton’s obnoxious op-ed in The New York Times calling for military deployment against Americans, to this feckless gem from the House GOP caucus:
Today, we remember the scores of innocent demonstrators killed by the Chinese Gov’t 31 years ago at Tiananmen Square for speaking out against the totalitarian regime.
We must hold the CCP accountable for suppressing freedom & for their malign activity that continues today. pic.twitter.com/ncPH2M33Je
Utterly blind to their double standard — a president who uses the military to suppress constitutionally protected speech in violation of his own oath of office is okay with them, yet they condemn a totalitarian government for suppressing speech with military force?
At least the Chinese show signs of breaking through their suppression — in spite of attacks on Hong Kong’s freedoms — after their government’s initial handling of the COVID-19 pandemic cost the country valuable time to stop the disease from ravaging Wuhan’s population.
Free speech would have saved Chinese lives; it would have prevented President Xi Jinping’s and the Chinese Communist Party’s loss of credibility caused by suppressing Dr. Li Wenliang’s warning about COVID-19.
Somehow I doubt Trump will learn anything at all from China’s failure.
He certainly doesn’t seem able to learn from his own.
~ ~ ~
It’s now June 5 here in the U.S. as I finish typing this.
31 years ago, a lone man carrying bags in his hands, as if he had just been shopping, stood in front of a line of tanks, impeding their procession. The Chinese military had fired upon protesters, killing as many as 500 people in Tiananmen Square during the previous two days in an effort to put down the pro-democracy movement.
For a moment in time one man stood between the regime and an oppressive future.
I’d like to think there are more than one or two persons here willing to stand up to systemic abuses and repression, and to hold them in check for longer than a moment in time.
The protesters in the streets over the last 10 days tell us there are.
The polls in November will tell us if there are enough.
What will our children say of this time in 31 years? What will they remember of us?
This week, as protesters have taken to the streets to demand justice for George Floyd, Breonna Taylor, and countless other Black people murdered at the hands of the police, local bail funds have been inundated with donations. One of my favorite tweets calling for people to take action:
An earlier version of that sentiment appealed to the science nerds out there:
If you don’t know the story of HeLa cells, here’s the CliffsNotes version, detailed in Rebecca Skloot’s excellent book The Immortal Life of Henrietta Lacks: in 1951, Henrietta Lacks, a Black woman, went to the Johns Hopkins medical center for cervical cancer treatment. Researchers took a biopsy from a tumor and discovered that her cells were unusually hardy, so they began culturing those cells and using them in medical experiments.
Soon, they sent and sold cells to other scientists, and now HeLa cells are among the most popular cell lines used by researchers. Lacks’s cells were used to test the first polio vaccine, showed scientists that humans have 23 chromosomal pairs and not 24, and have even been sent to space. Science owes a great deal to Henrietta Lacks and her family. But Lacks’s family wasn’t ever told their relative’s cells were being used, nor have they seen any money from that exploitation.
What happened to Lacks shows the many ways in which our society is built on Black people’s bodies and labor, and how this history often remains invisible because powerful people and institutions have a vested interest in ignoring it. To turn away from this obscures the truths science aims to uncover, and yet all too often, science writers or scientists shy away from discussing race and politics in their work. But it’s not possible to compartmentalize race and politics; they are inherently a part of science, and always have been. Any work that ignores anti-Blackness in our society and, especially, in science, is lacking. (And that’s aside from the morality of it all; as a Huffington Post article once put it, “I don’t know how to explain to you that you should care about other people.”)
What’s happening in the US right now only underscores how deeply race and politics are intertwined with science and nature. This pandemic disproportionately kills Black and brown people. Public health experts are speaking out about white supremacy as a public health issue that shortens Black people’s lives. And Christian Cooper’s encounter with a hysterical white woman who called the cops on him exposed how Black birders may be treated while engaging with nature (and led to Black Birders Week). Black lives matter, and it’s clear we need to keep working at building systems in science and nature that actually reflect that.
Say I wanted to make an app that recognizes avocado toast, and I have an AI model ready to go. All I need to do is feed it some pictures of avocado toast.
So the next time I’m out for brunch, I order avocado toast and take a picture:
One image in the bag! And if I label it “avocado toast” and send it to a human, they’ll know what avocado toast is and be able to recognize it. But machine learning algorithms need more than one picture in order to abstract the features that make this avocado toast and not some other kind of toast.
So next I go online and search for creative commons images of avocado toast:
That’s four down, ten thousand to go. Well, at least it’d be ten thousand if I wanted to have a highly accurate avocado toast recognizer using the industry standard of just throwing more and more data at it until it works.
But if I’m ok with having an AI that is only, say, 80% accurate, can I get the number of images down to a manageable level? If I’m taking the pictures myself, can I need even fewer images if I’m smart about how I take them? Maybe instead of spending time collecting thousands of pictures of avocado toast, some of that time would be better spent strategizing.
How much avocado toast does an AI really need?
1. More Avocado Toast with Avocado Toast Photoshoots
Maybe I notice all the above images are at a similar angle from above, and with one complete slice of toast just barely within the frame. They are all beautiful artistic photos. But I don’t want my toast recognizer to depend on the number of slices or how they are framed, and I don’t want it to fail in real world situations with unattractive toasts:
Avocado toast is avocado toast no matter the angle, no matter how many, and no matter whether you’d proudly share a picture of it. So as long as I’m taking photos, why not take a few, and let them be ugly?
In fact, I can take pictures all throughout the process, changing angles and lighting and background, to turn one snack into a photoshoot that collects 20 images for my dataset:
These 20 images might not be worth as much as 20 independent toast images would, but it’s so much easier than collecting even just two entirely unconnected toast images.
So how much more are these 20 images worth than 1? Was it worth my time to re-plate the toast onto different backgrounds? To nudge the slices around? Should I have taken even more variations, or did I reach the point of diminishing returns after the first few? Is this a good strategy at all?
I can make some guesses based on existing research, but there’s nothing like trying it yourself. So I decided to keep on gathering multi-image datasets on a single toast-making process. Here’s one for Blue Cheese Apple Honey Toast to add to my Not Avocado Toast dataset:
This time I took slightly fewer pictures. I figured I didn’t need so many variations on plain toast, since every toast I make will start with plain toast. I’m not sure how much I should worry about over-representing plain toast in my Not Avocado Toast dataset,1 but I’ll stick with my intuition for now.
2. Data Augmentation and Friends
There’s a well known technique in machine learning to turn one image into many images for your dataset, called data augmentation, where you make copies of an image and modify it with various filters. In the case of toast we should be able to flip or rotate the image, crop it a little, blur it a little, or shift the color slightly:
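As a sketch, these transformations are easy to express directly. Here’s a minimal pure-Python version operating on a grayscale image stored as a list of rows (the pixel values are made up for illustration):

```python
def augment(image):
    """Turn one grayscale image (a list of rows of 0-255 values)
    into a handful of simple variations."""
    flipped = [row[::-1] for row in image]                 # horizontal flip
    rotated = [list(row) for row in zip(*image[::-1])]     # 90-degree rotation
    cropped = [row[1:-1] for row in image[1:-1]]           # crop the border
    brighter = [[min(255, p + 20) for p in row] for row in image]  # slight color/brightness shift
    return [image, flipped, rotated, cropped, brighter]

tiny_toast = [[0, 50, 100], [10, 60, 110], [20, 70, 120]]
print(len(augment(tiny_toast)))  # 5 images from one original
```

Real training pipelines apply randomized versions of these on the fly rather than storing the copies, but the idea is the same: one labeled image becomes several.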
In fact, our toast recognizer AI already has this built into its training. It’s an easy way to turn my dataset of 100 pictures into 1000, which would be a respectable number if 90% of it weren’t almost the same as the first 10%. And of course there’s the fact that most of that 10% is repetitive too, being different pictures of the same toasts or toast-in-progress.
For automatic data augmentation you have to be careful not to do anything to the images that might make them unrecognizable, like shifting the hue too much or cropping out the important part:
In cases where more data is hard to come by but very valuable, sometimes it’s worth it to have people make variations by hand.2 This allows a greater range of image variations without accidentally creating bad data. For my AI that is specialized for toast and toast only, I expect it to work on the following hand-made edits:
For other applications I may want only whole-toast images, or only expect a certain range of sharpness or blurriness, or expect certain lighting. For toast, you can reverse the image to get another variation, but I wouldn’t include that in my augmentations if I wanted my AI to recognize letters of the alphabet.
The great thing about data augmentation is that there’s a bunch of research on it.3
We know data augmentation is useful, though artificially augmented images aren’t anywhere near as good as real new independent images. We should expect the similar images from our photoshoots to fall somewhere between the two in effectiveness. I’d also be curious whether it’s worth it to bother doing artificial data augmentation when you already have natural variations from a photoshoot. For a dataset this small you might as well, but for large datasets it might not be worth the extra computation.
3. Avocado Toast Alignment Space
Maybe with the right strategy I could improve my AI with a small set of carefully chosen Avocado Toasts that together represent all of their brethren. I like to imagine giving an AI a full sense of the world of possible avocado toasts just by feeding it the avocado toast archetypes. I won’t try my hand at creating the toast version of the Jungian Archetypes, but I will absolutely 100% create a Dungeons and Dragons-style alignment chart:
The Dungeons and Dragons alignment chart, which categorizes characters along the two dimensions of good/evil and lawful/chaotic, is not entirely dissimilar to how our AI works.4 During training, our AI tries to pick up on what features are important to categorizing toast. In its first layer, it only asks “how much green is in pixel #345? How much red is in pixel #9882?” etc etc. But we can hope that in later layers, as it looks at pixels in groups, it will start judging pictures with questions like “how much smooth green curve is in this picture?” and “do parts of it have a mushy green/yellow texture?”
Most likely, especially as our AI is way overpowered for our tiny dataset, it is not asking very reasonable and obvious questions about how much green is on the toast but is instead doing something more akin to conspiracy theory logic. “Yes I know that when you look at the toast, you see that it is covered in green mush,” the AI might say, “but actually it has 6 brown pixels in a pattern that looks kind of like the letter h if you squint in this precise computational manner. And look here, on this cinnamon toast! It has a very similar pattern of 6 brown pixels! And so does this peanut butter toast! Now, check me if I’m wrong, but I’m pretty sure that cinnamon toast and peanut butter toast ARE NOT AVOCADO TOAST. So I’m 100% sure this isn’t avocado toast either, and your feelings about seeing green mush don’t change the facts. I am very good at logic and math. You can count the pixels yourself if you don’t believe me, go ahead, I’ll wait.”
With enough data, we can hope the random coincidences like those 6 brown pixels will become overwhelmed by better choices in features. The common wisdom in the field is that we should trust the AI to figure out what the best features are rather than directly telling it to look for the green stuff. Even if it initially goes astray it will set itself right once it gets a broader range of toast experience. But when you’re working with a small dataset it sure would be nice to poke it in the right direction! Especially as photos with similarities, whether through automatic data augmentation or a photoshoot of different angles and variations, are more likely to have irrelevant features in common that the AI will pick up on and think are important.
I’d love to be able to annotate by hand what I think some of the important features are, and then let the AI relax into an optimized version of that.
In the above picture, I annotated Crust as “irrelevant” because the presence of crust doesn’t help tell you whether it’s Avocado Toast or Not Avocado Toast. But the AI might use crust recognition to help in other ways, such as identifying what parts are the toast and what are background. Crust can also indicate the scale of the toast within the image, which might help with identifying avocado slices and mashed avocado texture at the correct scale, as some pictures will be taken from closer or further away.
I know with my human brain that what makes Avocado Toast Avocado Toast is the avocado, which comes either in slices or mush, and it is fairly recognizable because it is green. But Not Avocado Toast can also have green, usually due to leafy greens such as arugula, so we have to look at the features that differentiate a leaf from either mushed or sliced avocado. To make a direct comparison to the Dungeons and Dragons alignment chart, let’s replace Order vs Chaos with Hard Edges vs Mushy, and Good vs Evil with Dark Green vs Yellow Green:
For some kinds of AI, the goal is to find a really effective alignment chart that you can put all the images into, and then calculate what line to cut along to separate the categories.
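To make “cutting a line” concrete, here’s a toy sketch: a perceptron finding a separating line through a made-up two-dimensional alignment space. The feature scores and labels are all invented for illustration — real features would come out of the network’s later layers, not be hand-assigned.

```python
def train_perceptron(data, epochs=1000, lr=0.1):
    """Fit a line w[0]*x + w[1]*y + b = 0 separating two classes of 2-D points."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x, y), label in data:
            pred = 1 if w[0] * x + w[1] * y + b > 0 else 0
            err = label - pred          # -1, 0, or +1
            w[0] += lr * err * x
            w[1] += lr * err * y
            b += lr * err
    return w, b

def classify(w, b, point):
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else 0

# Invented scores on the two alignment axes: (edge hardness, green darkness).
# Label 1 = avocado toast, 0 = not avocado toast.
toasts = [
    ((0.2, 0.9), 1), ((0.3, 0.8), 1), ((0.8, 0.7), 1),
    ((0.9, 0.1), 0), ((0.7, 0.2), 0), ((0.2, 0.3), 0),
]
w, b = train_perceptron(toasts)
print([classify(w, b, p) for p, _ in toasts])  # [1, 1, 1, 0, 0, 0]
```

Because these six made-up points are linearly separable, the perceptron is guaranteed to find a line that classifies them all correctly; the deep-learning version of this is the final layer of the network doing the same kind of cut in a learned feature space.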
While 9 archetypical images and just two dimensions of alignment might be unrealistic, I do think data is worth more than people think it is. The current expectation is that AI companies can’t function without huge amounts of data, and therefore the companies must be huge and must collect all the data from their users if their services are to work at all. This exchange of data for free services is viewed by most users as fair or necessary, and only the huge companies have the compute power to work with that much data. Things might be different if AI companies had an incentive to pursue smaller AI and curated datasets.
4. Is It Avocado Toast?
I know my bespoke avocado toast recognizer will never compete with corporate-created models trained on a million images. In fact, since the only data it has seen is of toast, it doesn’t know the difference between toast and literally anything else. It assumes everything is toast, just either avocado toast or a different kind.
But now that the AI is fed and trained, we can look through its eyes to see how it categorizes non-toast objects, according to the standards of toast.
Sorry, I don’t make the rules.
These are all the actual results of the AI I created.
(More on this in Appendix 5.)
To take a brief philosophical detour, sometimes I wonder if all of us are kind of like the AI that only knows how to see things as avocado toast or not. Humans are prone to judging things with high confidence based on the standards they’ve learned for the things within their experience, while completely not seeing the truth of what something actually is because it’s outside their experience.
5. I Fed Avocado Toast to an AI and Here’s What Happened
I fed avocado toast to an AI and you’ve seen what happened… to the AI. But what happened to me?
I knew that if I wanted to keep improving my AI, I’d need images that were more independent from each other than the ones I’d been taking. I’d been basically making the same kind of avocado toast over and over, as seen in these six different days in September 2019:
Delicious amazing heirloom tomatoes were in season, and I always use sliced avocado, rather than mashed avocado. What’s the point of mashing, unless you’re going to add other ingredients?
If I’m going to bother to mash, I’m going to step up my game. I might as well try to make the full range of possible avocado toasts. That’s when I thought of toast archetypes, and asked myself: “What would chaotic good avocado toast look like?”
In general the dataset I was building gave me that extra push I needed to get off my butt and make a food rather than eating something packaged or working through meals. And it encouraged me to expand my avocado toast palate to include a variety of different vegetables and proteins.
Workers in our field are notorious for getting overly focused with intense long hours that lead to unhealthy eating habits,5 and it’s not uncommon for folks to think of the body’s need for food as a nuisance and a liability rather than something to enjoy or spend time on. I feel a lot better when I eat something fresh and delicious. I know I should and that it’s worth the time, but it sure is easy to slip into bad habits.
Having these toast photoshoots be technically work made it easier to justify the time and interruption to eat a good food whenever I want one instead of growing slowly and subtly hangrier while pushing through work hours. And now that I’m in the habit of picking up the ingredients every week and making the recipes, it is much easier to do these things automatically without it feeling like an extra cognitive load.
So if you’ve got a snack or recipe you think you’d benefit from spending more time with, consider making an AI that needs to eat it. In this field we tend to treat our AI better than we treat ourselves, and apparently SOME of us are willing to cook to feed an AI when we wouldn’t cook just for ourselves. I could probably extend this lesson to other areas of my life.
Basically, while feeding avocado toast to an AI provided a lovely source of short-term amusement, this process will have long-term effects on me. Look at this toast now in my food repertoire:
I am forever changed.
This project also positively affected my social life, as I shared my toasts with colleagues who knew about the project and even received from them dozens more toasts to add to my dataset. For many of my friends and family my work can sound a bit abstract and difficult to connect to, but here was something where they could share tips and recipes, and enjoy the results. Using AI to strengthen human social bonds seems preferable to using AI to replace them.
Finally, this project also fulfilled its intended purpose of inspiring directions for research I wouldn’t have thought of without hands-on experience making and using a dataset. I mean, this AI may not be impressive, or useful, or… good. But I love it, it is mine, and it helped me explore a bunch of ideas that I think will come in handy later.
THE END, Kinda
Thus ends the general audience part of the post, thank you for reading!
For a slightly more technical or AI-invested audience, you are most welcome to read on.
Appendix 1: The “Coupon Collector Problem” as applied to the Value of Data
It is generally known that adding more data increases the effectiveness of your model in a logarithmic way,6 with the initial images adding a ton of value and then diminishing returns as your model approaches 100% accuracy.
How much of this depends on the assumption that you’re adding random new data, where each new piece of data is increasingly likely to have a lot in common with ones that came before? Maybe that’s kind of like how I kept adding avocado toasts to my dataset, but after 70 avocado toast pictures had still only filled out 5 parts of my alignment chart. It wasn’t until I stopped and thought about it that I realized my dataset was completely missing chaotic good toast and evil avocado toast, and instead of continuing on randomly for another 70 toasts (or more) I only needed to make 4 more toasts to complete the set.
Maybe with the right AI architecture, data would be worth more because it would describe these lower-dimensional feature spaces more quickly, rather than needing to train over a ton of data just to get rid of the noise. Or maybe data would be worth more if we were more careful about how we collected it, rather than adding random data and hoping we get what we need.
It’s like the Coupon Collector Problem, a classic in probability. It’s named after collecting a complete set of coupons from cereal boxes, and also applies to collecting a complete set of trading cards from random packs, or a complete set of blind-bag toys. This problem and many variations have been well studied, and guess what kind of curve you get as you collect more samples in search of a complete set:
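Spoiler: it’s the same kind of diminishing-returns curve. The classic result is that the expected number of random draws needed to complete a set of n coupon types is n times the n-th harmonic number, roughly n·ln(n). A quick simulation, using my 9-slot alignment chart as the coupon set, bears this out:

```python
import random

def draws_to_complete(n, rng):
    """Random draws needed before all n coupon types have been seen."""
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

def expected_draws(n):
    # Closed form: n * (1 + 1/2 + ... + 1/n), roughly n * ln(n).
    return n * sum(1 / k for k in range(1, n + 1))

rng = random.Random(0)
average = sum(draws_to_complete(9, rng) for _ in range(2000)) / 2000
print(round(expected_draws(9), 1))  # 25.5 draws on average to fill 9 archetype slots
```

So filling a 9-slot chart by random toast-making takes about 25 toasts in expectation, versus exactly 9 if you can see which slots are empty — which is roughly what happened when I noticed my 70 random avocado toasts had covered only 5 slots.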
That the Coupon Collector Problem and AI data problem both involve collecting samples in search of completeness, and that both result in logarithmic behavior, may be a purely metaphorical connection. One key difference is that, at least in the original formulation, the number of collectible items is finite and thus we do expect to reach 100%, while there’s basically infinite possible toasts and if we managed to catalogue them all you can bet that by then someone will have invented a new kind of toast.
I’ll leave this here as a suggestion for research, as it sure would be nice if we could do for AI the equivalent of seeing through the boxes to collect what we need rather than buying an exponentially increasing number of mystery items. We would also get much more interesting market behavior for data if we could tell the rare items from the common ones, rather than our data budget being overwhelmed by the cost of buying piles of near-duplicates in a completely nontransparent way.
For example, the Black Lotus card in Magic The Gathering is famously expensive both because it is a rare collectible and because it is a powerful card with value in high-stakes tournaments.7 What if there’s an equivalent toast image that would make any toast AI more competitive?
Imagine a collectible card game with infinite unique cards, but many are similar to each other or complement each other in particular ways, and you need to collect and build a 10000-card deck to compete in tournaments with million-dollar prizes. But you can’t buy the ones you need because there’s no market for selling cards, no one has developed a deck-building strategy, and no one has bothered to figure out how to price individual cards. And no one bothers to fix those things because why buy cards when most people will just give you their spare cards for free because they don’t have enough for a full deck anyway. A Black Lotus becomes worth just as much as anything else, because the market doesn’t have the mechanisms it needs to know better.
Appendix 2: A Cautionary Tale
When I started training my model on photoshoots of many related images, I expected the accuracy to go up as compared to my previous model that used a smaller set of unique individual toasts. The problem was that the model soon started showing over 99% accuracy. This had to be wrong. The dataset was too small, and with some testing I found that my Avocado Toast AI performed exactly as poorly as I expected it to in real world situations. Meanwhile, tried-and-true standard methods of testing accuracy were telling me I could put the “99% accurate” stamp of approval on my product.
My model used the standard method of a train/test split, where the images are randomly sorted into two groups. One group gets used to train the AI, and then after it’s trained you test it on the images it has never seen before. It wouldn’t be fair to test an AI on the same data it was trained on, because the model could basically “memorize” what the correct answers are for those specific images.
My guess for what went wrong is that, by feeding my AI related images, it became very likely that some very similar pictures of the same exact toast would end up being trained on and then tested. This lets the AI fake its way through categorizing the images in the test image set, creating a higher score than it would get with truly independent toast images. The architecture knew not to include its own automatic data augmentation images in the test set, but it didn’t know there were near-duplicates in the original dataset.
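Here’s a toy demonstration of that leakage effect, with all the numbers invented: a nearest-neighbour “model” scored on a dataset of near-duplicate photoshoots, once with a random image-level split and once with whole photoshoots held out.

```python
def one_nn_accuracy(train, test):
    """Score a 1-nearest-neighbour 'model': copy the label of the closest training point."""
    correct = 0
    for x, label in test:
        nearest_label = min(train, key=lambda t: abs(t[0] - x))[1]
        correct += nearest_label == label
    return correct / len(test)

# 20 "photoshoots" of 5 near-duplicate images each (one feature per image).
# Labels alternate shoot by shoot, and the feature carries no real signal
# about the label, so an honest model can't beat chance here.
shoots = [[(g + 0.001 * i, g % 2) for i in range(5)] for g in range(20)]

# Image-level split: 4 images of every shoot train, the 5th is tested.
train = [p for shoot in shoots for p in shoot[:4]]
test = [shoot[4] for shoot in shoots]
print(one_nn_accuracy(train, test))  # 1.0 -- near-duplicates leaked into the test set

# Group-level split: whole shoots stay on one side of the split.
train = [p for shoot in shoots[:16] for p in shoot]
test = [p for shoot in shoots[16:] for p in shoot]
print(one_nn_accuracy(train, test))  # 0.5 -- chance level once nothing leaks
```

The image-level split scores a perfect 1.0 purely by matching each test image to its near-duplicate sibling in the training set; held out by photoshoot, the same “model” drops to a coin flip.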
This made me wonder how often this happens with commercial AI models, where large datasets and powerful models would make a high score look like a good thing rather than a red flag. If there are duplicates in the dataset, or if some of the images are correlated, it might trick engineers into thinking their model is performing better than it actually will when put in a real world situation. This seems especially likely when data is collected through web scraping or crowdsourcing, and it becomes more likely the more data you collect.
This is part of why de-duplication is a standard part of data cleanup, but usually that is more about removing exact duplicates or near-indistinguishable duplicates. De-duplication might not catch similar images like the ones in my toast photoshoots, and there’s a certain art to how similar counts as the same depending on the application.
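One common automated approach is perceptual hashing: reduce each image to a short fingerprint that survives small lighting or compression changes, then compare fingerprints. Here’s a minimal “difference hash” sketch in pure Python — the tiny grids stand in for real photos, which would first be shrunk down to a small grayscale grid.

```python
def dhash(image):
    """Difference hash: one bit per left/right pixel comparison.

    `image` is a small grayscale grid (a list of equal-length rows);
    real implementations first shrink the photo to something like 9x8 pixels.
    """
    return tuple(1 if left < right else 0
                 for row in image
                 for left, right in zip(row, row[1:]))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

toast = [[10, 20, 30], [40, 10, 50], [90, 80, 70]]
near_dup = [[12, 22, 29], [41, 12, 52], [88, 79, 69]]  # slight lighting shift
other = [[90, 10, 90], [10, 90, 10], [90, 10, 90]]

print(hamming(dhash(toast), dhash(near_dup)))  # 0 -> flag as likely duplicates
print(hamming(dhash(toast), dhash(other)))     # 4 -> clearly different
```

The catch is exactly the “certain art” above: a hash distance threshold that catches re-encoded copies will not catch two photos of the same toast from different angles, so photoshoot-style near-duplicates still need metadata or human judgement.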
As I added more photoshoot sets, and expanded the variety of toasts I was making, the accuracy went down again. It can always be a bit disheartening to see that adding more data makes your model apparently perform worse, but I knew that in reality the model was becoming better. Focusing too much on the numbers disincentivizes adding new varieties of toast that widen the scope of what the model sees during training and testing, but sometimes you have to use your own good judgement rather than going by what the numbers say. The only test that matters is the real world.
Appendix 3: Train/Test splits, correlated data, and lumpy datasets
For my toast dataset, when I create the train/test split I don’t want to split images from the same photoshoot across training and testing. It might be better to have all the images from each toast-making photoshoot go into either one category or the other. For small hand-crafted datasets like mine it’s not too hard to group them by hand. In general I could see using metadata like timestamp and location to automatically group images into correlated subsets.
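As a sketch, a metadata-based group split might look like this — the file names and the “shoot” tag are hypothetical stand-ins for whatever metadata (timestamp, location) actually groups the images:

```python
import random

def group_split(items, group_key, test_fraction=0.2, seed=0):
    """Split items so that no group is divided across train and test."""
    groups = {}
    for item in items:
        groups.setdefault(group_key(item), []).append(item)
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test = [i for k in keys[:n_test] for i in groups[k]]
    train = [i for k in keys[n_test:] for i in groups[k]]
    return train, test

# Hypothetical photo records tagged with the photoshoot they came from.
photos = [{"file": f"toast_{g}_{i}.jpg", "shoot": g}
          for g in range(10) for i in range(5)]
train, test = group_split(photos, group_key=lambda p: p["shoot"])
print({p["shoot"] for p in train}.isdisjoint(p["shoot"] for p in test))  # True
```

Note the split is done over the shuffled group keys, not the images, so a 20% test fraction means 20% of photoshoots — which may be more or fewer than 20% of images if shoots vary in size.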
Data could also be flagged for similarity through an automatic process, with human review to see whether it’s a true duplicate or meaningfully independent. For example, two independent images that contain the same book might be flagged as containing the same thing based on the identical cover, while two identical toasts must be the same toast (or perhaps different copies of a book with toast on the cover?). The algorithm does not know or care that in real world situations you might see duplicate books but every toast is unique, so it’s up to us to design tests that reflect reality.
Or in my case, I’m using the same plate and cutting board in multiple photoshoots. As long as I use them for both Avocado Toasts and Not Avocado Toasts, I can hope that my model won’t be memorizing what my plates look like and using plate features to guess what toast is on them, but the thin stripes on my plate seem very easy for this kind of AI to recognize. If this plate shows up under avocado toast 60% of the time and under not avocado toast 40% of the time, my model might take that into account.
Is it better to pay more attention and try to get all my numbers even, to standardize how I make my variations? Or maybe those little statistics really do add up to have true predictive power of the sort that AI is really good at calculating. For example, maybe I should standardize how I take my lighting variations. I ate a lot of avocado toast outside during the summer, so those are the only ones with direct sunlight, and maybe I don’t want my AI to be fooled into thinking the harder edges and high contrast of direct sunlight is a distinctive feature of avocado toast. But maybe it should! Maybe direct sunlight is correlated with the particular heirloom tomatoes that were in season, which taste very good with avocado, and that little statistical correlation combined with some unconscious human predisposition to putting avocado toast on red plates plus many other correlations all add up to a strong prediction despite none of it having anything to do with recognizing the shape and color of avocado. Maybe this will remain true in general, even when testing on images I didn’t create.
It might be best not to test on images that were created specifically for ML model creation at all. If I test on a whole photoshoot’s image set, those results will be correlated with each other. If I pick a random representative from the set, maybe I’ll get the fully formed toast at its most divergent, but maybe I’ll get the plain slice from earlier in the shoot. Since all my photoshoots start with plain toast, I have a lot more data on that sort of toast, and I’d expect the model to perform better on it than it would on any random toast. Maybe I should pick the last image, or a random image from the last 5 images. But I do want to test on plain toast sometimes, too! Would hand-picking my test set based on my intuition lead to more accurate results, or simply introduce bias?
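One common way to formalize "don’t let a photoshoot straddle the train/test boundary" is a grouped split, where whole shoots are assigned to one side or the other. A minimal sketch, assuming each image record carries a hypothetical `shoot_id` field:

```python
import random

def split_by_photoshoot(records, test_fraction=0.2, seed=0):
    """Split records into train/test so that every image from a given
    photoshoot lands on the same side -- no shoot straddles the boundary.

    `records` is a list of dicts with a 'shoot_id' key (hypothetical schema).
    """
    shoots = sorted({r["shoot_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(shoots)
    n_test = max(1, round(len(shoots) * test_fraction))
    test_shoots = set(shoots[:n_test])
    train = [r for r in records if r["shoot_id"] not in test_shoots]
    test = [r for r in records if r["shoot_id"] in test_shoots]
    return train, test
```

For array-shaped data, scikit-learn’s `GroupShuffleSplit` does the same thing, with the shoot IDs passed as `groups`. It doesn’t settle the harder question of *which* image from a shoot is representative, but at least the model never gets tested on a plate it has already seen.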
Most AI research uses one of a few standard datasets with standard train/test splits so that the performance of different algorithms can be compared fairly. This is helpful when reading research, though it’s possible to get stuck on the details of what performs better on those particular datasets, and when people are comparing performance scores above 98%, I worry the whole field is overfitting its techniques to a few particular datasets and their style. These standard datasets are artificially ideal for the exact algorithms people are trying to make incremental improvements on: they have the same amount of data in each category, it’s all cleaned and uncorrelated and evenly spaced, and the photos follow the aesthetic norms people use for photos they’d share in a social context.8 I think the biggest room for improvement is on the human side of data creation and collection, but we’ll need to change our methods in order to see those improvements.
For the right type of AI and with the right data structuring, more data should in theory always be better. But there are myriad ways that more data can mess with your results if you’re not careful, and sometimes these pitfalls are only discovered through luck, after they have already happened.
While I was working on this post, Jabrils published a video9 that demonstrates how a dataset with multiple similar images of pokemon can decrease the quality of results when creating images using an autoencoder. My intuition is that this makes sense because the multiple similar images weight the autoencoder toward features that are almost identical between different headshots of the same pokemon, with less emphasis on the features shared between different unique pokemon. He got better results by deleting most of his dataset, which was counterintuitive to many commenters on the video, but made perfect sense to me after working on this problem for a while.
Appendix 4: Addressing Ethical Concerns about Plain White Toast Bias
Perhaps I should be more concerned that having plain toast over-represented in my dataset will make my AI model focus on plain toast to the detriment of other toasts.
With so many examples of plain toast in both the training and test sets, my AI might be incentivized to specialize in plain white toast accuracy and still get a decent score, because plain toast shares plain toast features. Learning those features would be easier than trying to learn how to recognize all the different features of toast-with-stuff-on-it. Our AI might decide toast-with-stuff-on-it is too diverse to bother accommodating:
Plain white toast might be more common in the real world than other kinds of toast, so it’s not necessarily a bad thing if our AI is biased toward it. The reason we have more plain white toast data in the first place is because I’m taking real photos of the toast in front of me, and there’s simply more plain toast, so our dataset reflects reality. And if the texture of plain white bread is easier to recognize, well, I don’t have a problem if our AI has the potential to perform extra well on plain toast for purely mathematical reasons. So it’s probably fine.
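Still, it’s worth at least measuring how much the imbalance buys a lazy model. A toy calculation with invented numbers: if 80% of my test set is plain toast, a model that only ever answers "plain" scores 80% accuracy, while balanced accuracy (the mean of per-class recalls) exposes the shortcut.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def balanced_accuracy(preds, labels):
    """Mean of per-class recalls, so a majority-class model can't hide."""
    recalls = []
    for c in set(labels):
        idx = [i for i, l in enumerate(labels) if l == c]
        recalls.append(sum(preds[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Invented test set: 80 plain toasts, 20 avocado toasts.
labels = ["plain"] * 80 + ["avocado"] * 20
lazy_preds = ["plain"] * 100   # a model that only ever says "plain"

print(accuracy(lazy_preds, labels))           # 0.8 -- looks decent
print(balanced_accuracy(lazy_preds, labels))  # 0.5 -- a coin flip per class
```

Whether 0.8 or 0.5 is the "right" score to care about depends on exactly the judgment call above: whether the imbalance reflects reality or is an artifact of how I took the photos.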
For another example, you may have seen the iconic Google Deep Dream images where everything is dogs. Why is everything dogs, when this AI is trained on so many other things? In fact, the dataset it uses10 follows the good-practice standard of giving every category of thing the same number of photos. I’m not quoting exact numbers, but it’s something like:
1,000 photos of birds
1,000 photos of cats
1,000 photos of trees
1,000 photos of dachshunds
1,000 photos of golden retrievers
1,000 photos of houses
1,000 photos of border collies
1,000 photos of pugs
…and so on
This is a tricky thing. Socially and linguistically, we do differentiate between different kinds of dogs. It would make sense for our AI to look at a dachshund and label it “dachshund”, look at a cat and label it “cat”, look at a bird and label it “bird”. One’s a breed, one’s a species, and one’s a whole entire taxonomical class, but we end up with an AI that is 1% bird and 10% dog. The AI is trained to think dog-like features are ten times as important as anything else ever.11
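You can see how the arithmetic plays out with a toy version of such a dataset. The category names and counts below are invented to match the rough 1%-bird, 10%-dog ratio:

```python
# Hypothetical miniature dataset: 100 balanced categories of 1,000 photos
# each, 10 of which happen to be dog breeds.
dog_breeds = {f"dog breed {i}" for i in range(10)}
others = {f"category {i}" for i in range(90)}  # bird, cat, tree, ...
counts = {name: 1000 for name in dog_breeds | others}

total = sum(counts.values())
dog_share = sum(n for name, n in counts.items() if name in dog_breeds) / total
bird_share = 1000 / total  # any single non-dog category

# Per-category balance, yet dogs collectively dominate:
print(f"{dog_share:.0%} dog, {bird_share:.0%} any one other thing")  # 10% dog, 1% any one other thing
```

The dataset is perfectly balanced at the category level, and still ten times as doggy as anything else, because the balancing happened at the wrong level of the taxonomy.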
When I made my Avocado Toast AI, I made a choice to train on two categories: Avocado Toast and Not Avocado Toast. I wanted an AI that was 50% avocado toast, not one that was 10% avocado toast, 10% plain toast, 10% peanut butter toast, etc. My goal with this project is not to represent reality, my goal is to feed avocado toast to an AI and learn from the experience.
As far as I know, standard practice for solving these kinds of problems in actual image recognition applications is to handle sub-categories separately: train an AI on 1% dogs and 1% cats, and if you get “dog”, run the image through another algorithm that specializes in dogs. Perhaps this is even the third step, after first categorizing it as an animal rather than a plant or object. This doesn’t work for all kinds of AI and applications, and it requires a strict taxonomy where all things at each level are treated as equal, which doesn’t reflect reality. You have to choose which dogs will represent all of canine kind in your reduced dataset, and whether avocado toast is a top-level category or a subcategory of Not Plain Toast.
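Here is a sketch of that staged approach, with the two models replaced by stand-in functions. Everything here is hypothetical; a real system would have a trained classifier at each stage:

```python
def coarse_classifier(image):
    """Stub for a model trained on balanced top-level categories.
    For the demo it just reads the answer off the record."""
    return image["coarse"]

def dog_breed_classifier(image):
    """Stub for a second model that only ever sees dogs."""
    return image["breed"]

def classify(image):
    # Stage 1: balanced top-level categories, so dogs are just 1 of N.
    label = coarse_classifier(image)
    # Stage 2: only if stage 1 says "dog" do we ask the breed specialist.
    if label == "dog":
        label = dog_breed_classifier(image)
    return label

print(classify({"coarse": "dog", "breed": "pug"}))   # pug
print(classify({"coarse": "bird", "breed": None}))   # bird
```

The structure keeps any one group from dominating the top level, but notice that the taxonomy itself (what counts as a top-level category, which breeds the specialist knows) is still a human choice baked into the system.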
Point is, the world is full of such algorithms making real decisions that impact people’s lives, trained on the equivalent of too many dogs or too much plain white toast.
If I’m creating my own ML model for custom data rather than a standard set then I have to make choices about what makes sense given the type of data I’m using and how it is created, and also how it is meant to be used. Going with an existing AI model is a choice too, and it is by no means an impartial one. All we can do is do our best, and keep an eye on the results rather than trusting the algorithm. Sometimes it involves better fitting the algorithm to the dataset, and sometimes it’s about changing the dataset to fit the algorithm, and in the best case you can fit both to each other.
I sometimes hear concerns that meddling with the model or data creates room for human bias to sneak in, similar to concerns about p-hacking or data dredging.12 This is a good thing to watch out for, but it’s often used as an excuse not to try. It doesn’t matter how carefully you stop more bias from sneaking in: the bias is already inside the house. Machine learning is useful exactly because real data is lumpy, clustery, and correlated, rather than a perfect even grid of points in N dimensions. The world did not come pre-defined with everything sorted into standard categories of standardized size. We’re making and breaking categories all the time, and sometimes the way we label things (or people) has far-reaching effects.
Fortunately for me, there are specialists who can help with best practices, audit algorithms, and consult on datasets and collection. If you’re involved with an AI that actually affects people’s lives, make sure you’ve got someone on the job who has expertise in these matters.13 Some folks treat AI ethicists as a bar to pass just before (or even after) product release, for legal or PR reasons, but if you let them, they can help you make sure, at a fundamental technical level, that your thing is doing the thing you want it to do rather than filling the universe with dogs. That’s better to find out sooner rather than later.
Even with the best experts and best intentions, it’s still essential to keep an eye on how a model performs in the real world and whether what it’s doing makes sense. No amount of data can simulate the real world, all models are flawed to some degree, and as with all things you just have to be open to recognizing your inevitable failures and doing your best to make things right.
We also need to stay aware of when trying to implement a particular AI might not be worth the societal cost. Training a high-performing AI may require labeling and categorizing people beyond what they’re comfortable with, and even if the resulting AI does good things, the ends might not justify the means.
As for me, I’ll stick to toast.
Appendix 5: Is Baby Yoda REALLY Avocado Toast?
I wasn’t lying when I said in section 4 that those were the real labels my AI applied to those photos. But I did have a narrative arc I wanted to create in this article, and I did what I had to do to get that story.
Obviously the AI has no experience with anything besides toast. I thought Baby Yoda was a good candidate to be judged as Avocado Toast because he’s green and has ears reminiscent of an avocado slice, though I figured that given the small size of my dataset and the lack of training on non-toast things, it would be basically random. The first image I tried was unfortunately judged to be Not Avocado Toast:
I printed it out and grabbed some colored flashlights so that I could tweak the image by hand in real time using the AI’s webcam interface. I thought maybe a greener tint would work. In fact, it was a strong red tint that changed the categorization:
I played around with a red and green flashlight to try and get Baby Yoda to appear with his natural skintone while also adding enough red to keep him categorized as Avocado Toast.
I tried a bunch of other pictures, angles, crops, etc., and found that most of them wanted to be Not Avocado Toast at first, but could be nudged into Avocado Toast:
Buckbeak and Samwise were even more difficult. After several failures I cherrypicked some photos with green backgrounds in the hopes that they would be easier to nudge into the categorization I wanted, which finally worked.
So yes, perhaps I cheated a bit to get these characters to be Avocado Toast. But it wasn’t quite as cheatery as I intended, because for many other characters I could not get them to register as avocado toast no matter what I did.14 Severus Snape is stubbornly not avocado toast, no matter what photo and variation I tried. He just isn’t avocado toast, and that makes Snape and Baby Yoda fundamentally different.
So the truth is, Baby Yoda really is more Avocado Toast than most. I could get any photo I tried of Baby Yoda to categorize as Avocado Toast with some variation or another, which wasn’t the case for any of the other dozen characters I looked at. Though if this weren’t the case I still would have written the main part of this article the way I did with no regrets; only this appendix would’ve been a little different. Something to keep in mind when you’re out there reading pop tech articles about AI!
Conclusion and One Last Plea for References, plus a weird rant about dolls
I’m sure folks are working on things related to all the above topics, so please send me references! And if you’re one of those folks, I’d love to hear your take. I hope this articulation of the problem helps spark some new ideas on the topic, and that you’ll look at toast with new appreciation.
I want a world where individuals and small groups can create their own AI using their own data, so it’s vital to get the best performance possible out of smaller sets of correlated data. To leverage AI technology to increase human potential, we shouldn’t have to rely on a few giant corporations, widespread privacy violations, and wastefully huge computations that demand enormous amounts of computing power and energy.
I want a world where we respect individuals’ rights, including their rights of privacy and property. Currently, tech companies feel that in order to be competitive they must push the boundaries of privacy and undervalue individual contributions, otherwise it is impossible to get enough data to make their technology economically viable. Perhaps with better techniques we can respect people’s data privacy as well as unleash the economic potential of data labor.15
Most of all I want a world where we use technology as a tool to boost our humanity, not one where technology eats away at us as we struggle to meet what it demands of us, while simultaneously making excuses for the harm it causes. As much as I anthropomorphize my avocado toast AI in this post, it’s no more than a bit of simple math.16 Any needs or desires it has are just a reflection of my own.
Playing with AI is like playing with dolls, it’s a chunk of myself I’ve externalized and personified as “other” in order to experiment with my perspective and my desires. I accept my Avocado AI as my own self-expression in algorithmic form, and I take full responsibility for its lust for cholesterol. I have no intention of cutting off that piece of myself and pretending it’s outside my control.
I wish the rest of the industry would catch up, because it’s kind of creepy. Like, imagine you go to the bank and a full-grown adult is playing with a Barbie doll, and she says to you in a squeaky voice, “You can’t have a loan! You’re not good enough!” and then says to you in her normal voice, “Oh, I’m so sorry, I wish I could help you buy that house, but Bank Manager Barbie has the final word on this matter. She’s an expert.” And then there’s a giant press event where she does this in front of the entire tech industry with no self-awareness whatsoever, and everyone eats it up.17
Anyway, my avocado toast AI is good and I like it, the end.