Thursday, July 23, 2020

GPT-3

There's been a lot of buzz recently about GPT-3, the "Generative Pretrained Transformer" from OpenAI. While they open-sourced the previous version, GPT-2, this version will instead be offered as a commercial API. They plan to make it easy to use so it will likely be integrated into many applications.

While the API is in closed beta, you can try out a version of it on AI Dungeon. AI Dungeon is geared toward text adventure games but can be used for anything. I tried it out by asking Turing-test-style questions to see how well it "understands" the world. Initially I was not so impressed, as I posted to Reddit. However, there were a few things I missed:

  • AI Dungeon uses by default a hybrid between GPT-2 and GPT-3. Once I updated it to use the "dragon" model (and reduced the randomness), it switched to full GPT-3 and the results improved.
  • For some questions, GPT-3 needs to be primed in that category before being asked a question. For example, it can solve simple math questions if it's given a few examples first. I think this is impressive since in the future it may be able to answer the questions without the priming and meanwhile specific applications can fine-tune it. 
  • A commenter on Reddit reported using the actual API and getting a better result for one or two of the questions. It's possible that AI Dungeon primes GPT-3 to answer prompts in a certain style and this causes it to do more poorly on general questions. 
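The priming described above amounts to placing a few worked question-and-answer pairs ahead of the real question. Here is a minimal sketch of building such a prompt. The function name `build_few_shot_prompt` and the "Q:"/"A:" layout are illustrative assumptions, not something AI Dungeon or the OpenAI API prescribes, and no model is actually called:

```python
def build_few_shot_prompt(examples, question):
    """Return a prompt with worked examples placed before the real question.

    `examples` is a list of (question, answer) pairs. The "Q:"/"A:" layout
    mirrors the transcripts in this post; it is an assumption, not a format
    required by any particular model.
    """
    lines = []
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    # End with an open "A:" so a text predictor continues with the answer.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


# Hypothetical priming examples for simple arithmetic.
math_examples = [("What is 2 + 3?", "5"), ("What is 7 + 1?", "8")]
prompt = build_few_shot_prompt(math_examples, "What is 4 + 9?")
print(prompt)
```

The resulting text would be pasted in (or sent) as the prompt. Whether the model then answers correctly still depends on the model and its settings.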
With the right settings in place, GPT-3 is able to answer a wide range of questions, including questions involving ambiguous grammar. For example, here are my questions and its answers:
Q: Arjun lived in an Indian jungle. One night he heard a noise, went outside and shot a bear in his pajamas. Where was the bear?
A: The bear was in the jungle.
Q: Arjun lived in an Indian jungle. One night he woke up and squished a spider in his pajamas. Where was the spider?
A: The spider was in his pajamas.
I can see GPT-3 being useful in many applications from search engines to online tutorials. It can also be used for less noble purposes such as advanced spamming and trolling, but OpenAI plans to control usage of it carefully.

GPT-3 went from being basically ignored to over-the-top hype very quickly. While it's certainly impressive how much it can do, it's quite far from a general AI that can do anything. It's still just a text predictor without a fundamental understanding of the world. For example, while it can answer comparison questions within a specific category, it struggles with cross-category comparisons, even after being primed for them. I asked it the following question after having provided it with similar questions and answers:
 Q: Which of these is largest? Planet Venus or an elephant?
It answered:
A: an elephant
I tried a few more similar questions but it consistently got them wrong. I assume the elephant wins most size comparisons on the internet, but it's still smaller than a planet. Assuming the standard API doesn't do better, this seems to be a large blind spot with GPT-3. Text prediction can answer many questions, but sometimes better knowledge is needed. It would be interesting if they could find a way to combine GPT-3's text prediction with a structured knowledge algorithm. In the meantime, humans are still needed.



Thursday, July 9, 2020

Why do Good?

Summary: Why do good? In many cases it's simpler to live life that way. Humans are not robots so they need straightforward heuristics to follow.

Should a person steal if they can get away with it?
Even for self-interested reasons, a person shouldn't wrong other people. There may be cases where the payoff from stealing beats the risk of getting caught, but people are not robots and it's not worth the anxiety to constantly look out for such cases or worry about getting caught afterwards. It's simpler to just be an overall good person. To quote Epicurus, whose ethics was self-interested, "The greatest reward of righteousness is peace of mind".

Why should you help other people?
Even for self-interested reasons, it's worthwhile to help your friends, relatives, neighbors and coworkers. This builds positive relationships and people will help you back in the future. However one shouldn't just do everything in a purely tit-for-tat manner since it's hard to calculate how you'll be paid back so it's simpler to just be helpful. People can also detect if someone is genuinely helpful or just Machiavellian and they want a friend who has their back, not someone who calculates the optimal Bayesian game-theoretic expected value of a good deed.

However, it's fair to avoid being taken advantage of so you don't need to treat freeloaders the same as those who contribute back. This also doesn't address whether one should help a stranger, which is discussed later.

Should you do a good job at work?
As an employee, you can think every moment about getting a good performance evaluation and not do anything that doesn't directly help that goal. However it's simpler to focus on doing a good job and aim to be helpful to your team and company. In a healthy company, employees who do this will be recognized. You also need to keep an eye on getting a good evaluation, but it doesn't need to be your entire focus.

Large companies generally require some kind of formal performance evaluation so they can maintain certain standards across the company and avoid freeloaders, but they also want a healthy culture so that employees care about doing a good job and don't just focus on the measure. With the right balance, the company can succeed and reduce the problems caused by Goodhart's Law.

Should companies do good?
A company can focus every moment on maximizing profit or it can focus on a higher goal such as providing a good product or service to its customers. In theory, focusing on profits should lead to higher overall profit than any other approach. However, profit-focused humans tend to aim for short-term profits at the expense of the longer term company value:
  • A small business doesn't refund a customer and ends up losing the customer and perhaps their friends. 
  • A public corporation focuses on quarterly results and ignores long-term R&D, customer satisfaction or employee retention.
While a perfect profit-focused algorithm may be able to factor everything into the long term value, humans don't think like that. It's simpler to focus on following certain values, such as always providing good customer service, even if it's more expensive in the short-term.

A country cannot run on companies providing products or services based on their goodwill alone. There will always be freeloaders, and ultimately people need competition and a profit motive to try their hardest. However a country cannot flourish if every company is entirely focused on maximizing profits. In a healthy economy, the companies that focus on providing a benefit to their customers (and society) are the companies that will succeed.

A culture of good
In short, it's simplest for the employee or company to keep a daily focus on doing good, while also keeping an eye on the metrics they're measured by (whether performance evaluations or profits). The company itself certainly wants employees to focus on doing good work, and countries certainly want companies to focus on providing good products and services.

How can companies and countries encourage good behavior? A large part of it comes down to cultural norms and expectations:
  • Joe joins a company where everyone only focuses on what's explicitly measured in their performance evaluation. Joe can then turn down all work not connected to his evaluation and no one will think less of him. 
  • Joe joins a company where people fix whatever needs to be fixed and help each other out. If Joe only works for his evaluation, employees may look down on him, and that can even end up harming his evaluation.
A large part of culture is self-reinforcing so it's important for a company to get this right early and hire helpful employees and encourage them to be helpful.

Why is good successful?
If doing good doesn't capture all of the ways an employee or company is evaluated, why is it the best thing to focus on? I think this also relates to culture and what people value:
  • If an employee does good work but misses some performance technicality, a healthy company will still give a positive evaluation. 
  • If a company is known to commit fraud or bribery, it will be harder for them to retain employees and customers in a society that values justice.
In a country where bribery is the norm, there's much less risk from participating in it, since people expect everyone to do it anyway. This is why it's important to build good cultural norms. However it's hard for governments to control culture; unlike companies they don't even get to pick their members.

Not the only reason to do good
The above all focused on achieving success, but one should do good for its own sake. By doing good one can achieve more meaning and happiness than from material success. And only doing good is fully within one's control.

If people do good for its own sake, they will also do good to complete strangers, even if it can't be paid back. The doer won't get any reward from this but for their own eudaimonia. And the country or world with more of such people will flourish.

Ethical systems
Now that we've come to ethics for its own sake, what ethical system should one follow? If one wants the best outcome for the world, it seems one should be consequentialist. A perfect consequentialist algorithm could calculate the optimal action in every case to bring the greatest good to the most people. However humans are not robots, and it's easy to use consequentialist thinking to justify bad behavior:
  • It's OK to steal a little from the wealthy, I'll get more benefit from it anyways.
  • It's OK to trespass on this private property, I'm not hurting anyone.
An algorithm could correctly factor in how even small actions of stealing and trespassing add up to worse consequences overall. But for humans, it's simpler to just follow ethical rules or try to live virtuously.


Wednesday, July 1, 2020

Biology to Learn

This is the sixth post in the things to learn series. See the intro or the last post about biology vs. physics. This post lists interesting questions and topics in biology.
  • What is life? 
  • DNA and Genes
    • Expression - How does the genetic message go from DNA to RNA to proteins?
      • How do things like genetic dominance work at the chemical level?
    • Reproduction - How does DNA replicate? How does it ensure variation? It's almost paradoxical how much effort life spends to preserve DNA and then also to mix it up. 
      • Multiple swaps happen during meiosis
      • How are traits inherited? (From Mendelian single-gene traits to more complex multi-gene traits)
    • Differentiation - How do cells differentiate during fetal development?
      • Initial impetus based on amount of fluid detected in egg/fetus, which then sets off chain reaction where genes signal to other genes. (Seems almost recursive. How did this process evolve?)
    • A bit on modern techniques for editing DNA
      • Old tech to transfer genes from one organism to another
      • CRISPR
    • Bigger picture of genetic differences. What does it mean that humans share ~50% of their DNA with a banana or 99.9% of their DNA with each other? How much do people differ from each other? What does that mean? How relevant is the non-coding DNA?
      • It seems humans are not really 99.9% the same. Even just in coding DNA, letter differences change whole words and CNVs repeat words.
    • Practical things one can learn from getting a DNA test
    • What genes led humans to be so different from e.g. chimpanzees? How can a small number of genes make a large difference in the brain's development? How does non-coding DNA affect things?
  • Evolution
    • Quantitative evolution -  Rates of mutations of DNA of different organisms. How long it takes for an adaptive gene to spread in a population. To what extent can the path of evolution be traced?
    • The possible origins of the first life
    • The role of epigenetics 
    • Philosophy of evolution
    • What level evolution occurs at and how animals cooperate (see The Selfish Gene)
    • Evolutionary psychology - how much actual evidence vs. speculation. Seems in many areas the brain is general purpose and people can adapt without genetic mutations.
      • Related: philosophical interpretations of human nature
  • The brain
    • How can thoughts and memories arise from neurons? (This is understood to a certain extent.)
    • How does consciousness work? (Difficult question!)
      • How do Buddhist meditative views on consciousness relate to the scientific nature of the brain. (See Why Buddhism is True)
      • To what extent are different animals conscious? Very simple animals (e.g. hydras) are not, and mammals appear to be but what about in-between?
    • How did and does the brain develop (evolution, culture, nature, nurture)
    • What happens to the brain during sleep?
      • Why is it so important for health?
      • Can dreams be interpreted as random neurons firing?
    • To what extent is the brain hardwired when born vs. a system that learns? 
      • Brain starts in very flexible state, but people eventually lose the ability to learn things like vision and speech. Some people can control extra fingers (See polydactyly.) What else could be wired to brain? Brain needs to be general purpose to have evolved.
    • Computational neuroscience - how does the brain compare to artificial neural networks? Besides direct neurons firing, what else in the brain is used for processing?
    • Behavioral neuroscience - To what extent does understanding the physical mechanisms of the brain help with understanding human psychology? In general, can the mind be viewed as a fully operating layer or are there many leaky abstractions?
  • The human body and practical health
    • Digestion and nutrition
      • What makes a balanced diet?
      • Metabolism rates and people's weights. How would skinny people have fared in hungrier times? (See also The Hungry Brain)
    • Infection and disease
      • how bacteria and viruses spread
      • how the layers of the immune system work
      • how allergies develop and why they're more common now
    • Exercise
      • Why it's beneficial
      • What practices for most benefits?
      • How muscles strengthen and weaken 
    • Answering health questions - the fundamentals to know + search skills to find answers
    • The connection between psychological wellbeing and physical health
    • Modern world - evaluating the risks that new substances (e.g. Teflon, BPA) may pose to human health
    • Teeth - how cavities develop and best practices for preventing them
      • Besides sugar, which foods are most harmful? How long does it take for decaying processes to start occurring? 
      • Can one reduce or prevent colonization of the mouth by harmful bacteria?
      • Does flossing work in practice? What are the alternatives?
      • What other treatments exist (e.g. silver diammine fluoride)?
    • Sleep - what happens in the body during sleep, best practices for sleep
  • Big picture topics 

Sunday, June 28, 2020

Biology vs. Physics

This is the fifth post in the series on things to learn. See the intro or the last post on learning physics.

The natural sciences are divided into two branches: the physical sciences (primarily physics and its derivatives) and the life sciences (a.k.a. biology). Biology is different from physics in many ways, which affect how one learns it:
  • Less Math - Math is fundamental to all of physics but it's more incidental in biology. This can make biology easier to learn for many people.
  • More complexity - As challenging as physics is, it's ultimately about simple concepts. But biology is about life, which is complicated.
    • Textbooks filled with terminology and small details can make learning biology more tedious. However I think there may be a way to focus more on the overall concepts involved than on the exact terminology and details. When learning for general curiosity, you don't need to know every exact term; you can just learn the terms that will be repeated enough to be worth learning. (See XKCD's Thing Explainer for an exaggerated example of explaining concepts with less terminology.)
  • Unknown frontier - Physics has already solved most areas that a layman would be interested in. The current frontier of physics deals with problems that would be hard for a non-physicist to relate to, and it would take years of learning to understand them. Meanwhile biology is filled with unsolved questions in every area from neuroscience to nutrition to genetics to diseases, and one encounters these issues right away.
    • Update: this point is debatable since there are unsolved questions in physics that a layman would be interested in.
  • Practical - If you're not an engineer you're unlikely to use knowledge of physics for anything practical. But biology topics like nutrition and disease are relevant to living longer and healthier lives.
There are other ways that physics and biology differ:

Inherent or accidental?
It seems that many parts of physics could be intuited based on other principles and couldn't be any other way:
  • Falling objects - Galileo argued against the Aristotelian idea of motion (that heavier objects fall faster) not only with experiments but by pointing out the logical paradoxes that would result.
  • Inverse-square law - While one could imagine forces decreasing in other ratios, decreasing in proportion to 1/r² seems the most logical, since a force radiating out from a point will spread out according to the formula for a sphere's surface area (4πr²).
  • Relativity - While most people wouldn't intuitively think of Special Relativity, it seems Einstein was able to recognize that it was the "only way" possible. He was able to derive this based on a deep understanding of the implications of Maxwell's equations, and he may not even have been aware of the Michelson-Morley experiments.
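The geometric argument for the inverse-square law can be written out as a short worked equation. This is a sketch that assumes a point source emitting a conserved total quantity S uniformly in all directions (S and I are illustrative symbols, not from the original text), so that the intensity I at distance r is the total divided by the sphere's surface area:

```latex
I(r) = \frac{S}{4\pi r^{2}}
\qquad\Longrightarrow\qquad
\frac{I(2r)}{I(r)} = \frac{r^{2}}{(2r)^{2}} = \frac{1}{4}
```

Doubling the distance spreads the same total over four times the area, so the intensity falls to a quarter.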
Questions in physics are still resolved through experiments, but maybe this is to demonstrate the truth to those who don't have the right intuitions of the way nature "needs" to be. When Einstein was asked what if the experiments had disproven his theory of General Relativity, he said "then I would have felt sorry for the dear Lord. The theory is correct." While physics cannot just be pure deduction like mathematics, it's the closest one can get. The eventual goal of physics is to find the theory of everything from which everything else is derived.

Biology however deals with the complex messiness of life, and there are many ways to be a living thing. Scientists can make predictions based on the data they have, but they can't derive how systems "must" be. Living things are "accidental" in the Aristotelian sense of having traits that they happen to have but could lack.

Purpose 
Ancient and medieval physics used teleological explanations as Aristotle emphasized the "final cause" (or purpose) as one of the "four causes" to explain the way things are, and argued against Democritus who rejected it. Modern physics, starting with Francis Bacon, returned to the physics of Democritus and dropped "purpose" from consideration. Since Isaac Newton, the motion of heavenly and earthly bodies is explained with simple physical laws, without reference to any goal or "natural place" of matter. 

Unlike rocks or stars, living things act with purpose. Even a simple bacterium seeks food, evades predators and maintains its internal state. While scientists no longer use theological explanations to explain why organs and organelles have certain functions and designs, these elements still exist and are worthy of explanation. Some use the term teleonomy to distinguish modern explanations of biological purpose from earlier ones.

In short physics is about mathematical explanations for "simple" things from atoms to galaxies, while biology is about the complexity of life, with all its purpose. 



Tuesday, June 16, 2020

Learning the Physical Sciences

This is the fourth post in the series on things to learn. See the intro or the posts on math and software development.

Should Studying Science be Mandated?
Most people won't become scientists so learning science is about satisfying curiosity about how the world works and came to be, not about learning a practical or career-oriented topic. Beyond the most essential understanding of how the world works, the physical sciences should be an optional part of the K-12 curriculum. Students who are interested in science can be encouraged to learn it since some of them may appreciate the opportunity and a fraction of them will later use it in their careers. Those who are uninterested are unlikely to become scientists themselves, but they can always catch up later if they desire to.

Once a student commits to learning a topic in high school or college, they can force themselves to continue learning it even when it's difficult, since they want to do well in the course. This is the one benefit of schools - they provide a structure or incentive system where people can learn. Once someone leaves school and is just learning on the side for enlightenment, they're less likely to "force" themselves through difficult topics. However, when you're learning on your own, you can choose to learn the most interesting topics.

Learning the Concepts in Science
If you're learning science just to satisfy curiosity, you don't need to learn every technical detail covered in textbooks.

Q: Can you learn physics without advanced math?
A: I think so:
  • Many areas of physics (such as mechanics) can be understood with basic algebra and maybe a sprinkle of simple calculus.
  • Even in other areas, it seems one can get at least a partial conceptual understanding without covering all the mathematical details.
While a researcher or engineer may need to know all the mathematical nitty-gritty, someone learning physics for knowledge can likely skip over some of these details. In the past it was even possible to make significant discoveries in physics with limited knowledge of math. For example Michael Faraday was "one of the most influential scientists in history" despite the fact that "his mathematical abilities... did not extend as far as trigonometry and were limited to the simplest algebra". (Though even there, James Maxwell's equations were needed to fully understand the implications of Faraday's discoveries.) Physics became more complex over time, so later developments in physics require more math to truly understand them, but one can still learn a simpler version of any topic.

Books that cover concepts in Physics
These are books that give an overview of physics and its development:
  • Seven Ideas That Shook the Universe - different paradigms in physics: Copernican astronomy, Newtonian mechanics, energy and entropy, relativity, quantum theory and conservation principles & symmetries.
  • The Evolution of Physics (By Albert Einstein and Leopold Infeld) - As summarized by the table of contents, it covers The Rise of The Mechanical View; The Decline of the Mechanical View; Field, Relativity; and Quanta. Slightly similar to the above book, though from Einstein's perspective.
  • The Character of Physical Law (by Richard Feynman) - Instead of covering all of physics, it goes through certain ideas as examples of physics. This is the written version of a series of lectures by Feynman so it isn't as edited as the above books, but it contains Feynman's unique style.

Specific Topics in Physics
Here are some interesting topics in physics that seem worth learning more about.
  • Mechanics - Force & Motion & Inertia
    • The basic formulas and their calculus.
      • Example question: Intuitively, why is Kinetic Energy (KE) proportional to v² when momentum is proportional to v (velocity)?
        Answer: Let's say you want to stop a frictionless moving car by attaching a friction block that drags on the ground with a constant force. A car going 2x as fast will take 2x as much time to stop, as expected since it has 2x the momentum. However, it will take 4x as much distance to stop the car. Each unit of distance generates the same amount of friction heat, so the car going 2x as fast must have 4x the KE. Similarly if you want to drop a block and have it go 2x as fast as another block, you'll need to raise it to 4x the height. This was also a controversy between followers of Newton and Leibniz, see Vis Viva.
    • How/why is inertia and conservation of momentum so fundamental in all of physics?
  • Gravity (Newtonian)
    • How Newton discovered the law of gravity from a better understanding of motion.
      (I.e. how Newton built on Galileo to create his laws of motion, then connected them with Kepler's laws of planetary motion and then connected that with the moon's motion and universal gravitation.)
    • Basic math of satellites and planets in orbit
    • Key concepts in general relativity
  • Electromagnetism
    • Understanding what electric and magnetic fields are and how they interact with charged particles.
    • How special relativity resolved issues raised by Maxwell's equations. 
      • Interesting when reading Einstein's writings, how strong his intuition was to avoid any special frames of reference and how this took priority over other intuitive ideas such as about absolute time...
  • Thermodynamics
    • What is entropy? Besides the fundamental meaning for particles, how does it affect non-thermodynamic order? What was the entropy of the universe initially? How does gravity affect entropy? (See also heat death paradox, as well as this question.)
      Understanding Physics (by Isaac Asimov) gives a basic explanation of the laws of thermodynamics. The first law is about the "absolute" store of energy. But energy can only be used when it flows from "high" to "low". And over time differences even out, so entropy increases. The book has this more philosophical observation:
      We thus find there is an odd and rather paradoxical symmetry to this book. We began with the Greek philosophers making the first systematic’ attempt to establish the generalizations underlying the order of the universe. They were sure that such an order, basically simple and comprehensible, existed. As a result of the continuing line of thought to which they gave rise, such generalizations were indeed discovered. And of these, the most powerful of all the generalizations yet discovered — the first two laws of thermodynamics — succeed in demonstrating that the order of the universe is, first and foremost, a perpetually increasing disorder.
    • How does "information" as a physical concept connect to this? (See Wikipedia and Stanford article.)
    • Is the second law of thermodynamics more "proven" than other natural laws?
    • How the theoretical science developed from the technological development of steam engines (and compare with how computers developed) 
    • Practical applications in everyday life (e.g. opening the fridge won't cool the room)
  • Nuclear physics
    • The nuclear bonds (and how E=mc² is not that relevant).
      Compare nuclear bonds with chemical energy.
      (Bonus: the weak force and how it relates to electromagnetic force) 
  • Quantum mechanics - to what extent can it be understood by a layman?
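Returning to the kinetic-energy example under Mechanics above, the "2x the time, 4x the distance" claim can be checked with the constant-deceleration formulas. This is a minimal sketch. The mass and friction force are arbitrary assumed values, and the function name is made up for illustration:

```python
def stopping_time_and_distance(speed, mass=1000.0, friction_force=5000.0):
    """Time and distance for a constant friction force to stop a car.

    Closed-form results for constant deceleration a = F / m:
        time     = v / a
        distance = v**2 / (2 * a)
    The default mass (kg) and force (N) are arbitrary illustrative values.
    """
    deceleration = friction_force / mass
    time = speed / deceleration
    distance = speed ** 2 / (2 * deceleration)
    return time, distance


t1, d1 = stopping_time_and_distance(10.0)  # car at 10 m/s
t2, d2 = stopping_time_and_distance(20.0)  # car going twice as fast

print(t2 / t1)  # ratio of stopping times
print(d2 / d1)  # ratio of stopping distances

# Work done by friction (force x distance) equals the initial kinetic energy.
work = 5000.0 * d1
kinetic_energy = 0.5 * 1000.0 * 10.0 ** 2
print(work, kinetic_energy)
```

The time ratio tracks momentum (proportional to v) while the distance ratio tracks kinetic energy (proportional to v²), which is the distinction the example question is after.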
Other topics in the physical sciences
  • Astronomy & astrophysics - How the universe developed
    The formation of all elements (Stellar nucleosynthesis). The cycle of stars. How matter regrouped after stars exploded. (See Wikipedia on Stellar population.)
  • Chemistry
    • how does the number of protons/electrons determine the properties of elements?
      • Much of this is more basic chemistry, as seen in repetition in the periodic table
      • Sometimes the specifics of how properties like color are determined can involve more complex areas, e.g. need relativistic quantum mechanics to explain why gold is gold-colored instead of silver. 
    • How does the structure of electrons in chemical compounds determine their properties? 
  • Earth science
    • Development of earth
    • Earth's magnetism
    • Global warming

Thursday, June 11, 2020

Skills to Learn for Software Developers and Others

The previous post discussed math topics I'm interested in learning; this one discusses programming-related skills that are important and that I'd like to improve at.

While there are many technical skills important for software developers, this post will cover general (non-programming) skills, and programming skills that are useful for other careers.

General skills
These are general skills that are important in software development and in many other office jobs as well:
  • Focus - Often one encounters difficulties and it's easy to get frustrated and distracted. The test is still failing? Might as well browse emails or the web. But switching tasks breaks up the train of thought you had so you'll take even longer to solve the problem. (One second, just going to check my emails. Now where was I...?) Often one needs relentless focus on an issue in order to make progress quickly. And not just "guess and check" thinking where you randomly try different things hoping you'll find a solution, but "binary search" thinking where you home in on the issue until it's solved. There are times when it can be helpful to take a break and return to the problem later, but that should be done after you've given the problem solid focus and hit a wall.
  • Typing 
    • While raw typing speed should never be a significant bottleneck when programming, any effort on typing or fixing typos can take your focus off the main issue at hand.
    • Programmers type far more chats and emails than actual code; it's best to do this as quickly as possible.
    • Besides basic typing skills, one should also be comfortable with the relevant keyboard shortcuts for their OS, terminal and IDE. Moving to the mouse is another micro distraction that is best avoided. 
  • Memory / note system - When learning programming one struggles with remembering all sorts of details about language and syntax, but eventually you get the overall hang of how things work, and can easily look up syntax as needed. But there will still be many issues that you solve (or get help with) where you'll want to remember the solution for the future, and your memory isn't always enough. It's useful to have a note or bookmark system to quickly look up how to do things.
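One reading of the "binary search" thinking mentioned under Focus is to halve the set of suspects at every step instead of guessing. Here is a minimal sketch that finds the first change that broke a test. The names `first_bad_change` and `is_bad` are hypothetical, and it assumes that every change before the culprit passes and every change from the culprit onward fails (the same assumption `git bisect` relies on):

```python
def first_bad_change(changes, is_bad):
    """Return the index of the first change for which is_bad(change) is True.

    Assumes all changes before the culprit are good, all changes from the
    culprit onward are bad, and the last change is known to be bad.
    """
    low, high = 0, len(changes) - 1
    while low < high:
        mid = (low + high) // 2
        if is_bad(changes[mid]):
            high = mid       # culprit is at mid or earlier
        else:
            low = mid + 1    # culprit is after mid
    return low


checks = []  # records which changes were actually tested


def is_bad(change):
    checks.append(change)
    return change >= 37  # hypothetical: change 37 introduced the failure


changes = list(range(100))
culprit = first_bad_change(changes, is_bad)
print(culprit, len(checks))
```

With 100 candidate changes this needs only a handful of test runs, whereas trying changes at random could take dozens.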
General programming skills 
These are programming skills that are useful for many jobs, not just for professional software developers: 
  • SQL - The world is built on SQL, often with a few other layers stacked on top of it. Besides writing SQL when developing an actual application, it's essential in many other cases such as:
    • analyzing experiments or general usage of a product
    • finding sample data to test something out
    • querying logs to debug an issue in production
Many alternatives to SQL have been developed, but there's often no avoiding SQL itself. It helps to become proficient with it so one can quickly find the data they need and avoid common bugs such as accidentally duplicating rows. Many other professions, such as analysts or product managers, will also find it useful.
  • Regex - Programming is often about finding the right example to base your code on, or about quickly finding and replacing text. Regex makes this faster. Anyone who deals with large data or texts will find it helpful as well. 
  • Scripting - Sometimes it's useful to write a quick script to help generate code or analyze data. Non-professional programmers may want to write a script to help with their science research or with their spreadsheets.
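The "accidentally duplicating rows" bug mentioned under SQL typically shows up when joining two one-to-many tables at once. Here is a small sketch using Python's built-in `sqlite3` module. The tables `users`, `orders` and `logins` and their contents are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    CREATE TABLE logins (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'Ana');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 20.0);
    INSERT INTO logins VALUES (1, 1), (2, 1), (3, 1);
""")

# Buggy: joining two one-to-many tables repeats each order once per login,
# so the sum counts every order three times.
buggy_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM users u
    JOIN orders o ON o.user_id = u.id
    JOIN logins l ON l.user_id = u.id
""").fetchone()[0]

# Safer: aggregate the orders on their own, without the unrelated join.
correct_total = conn.execute("""
    SELECT SUM(amount) FROM orders WHERE user_id = 1
""").fetchone()[0]

print(buggy_total, correct_total)
```

The buggy query runs without any error, which is what makes this kind of mistake easy to miss. Comparing row counts before and after a join is a quick sanity check.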
Worth learning
While one can learn many skills on the job, often it's helpful to take a step back and learn the subject in-depth. This way you can learn how to do something properly instead of just finding the easiest solution at the time. This would be an area where schools could help, but as expected, they don't give these subjects their proper due.

Tuesday, June 2, 2020

Maths to Learn

In the previous post I discussed the five categories of knowledge. These posts will go through different subjects that I'm interested in, starting with Mathematics.

  • Theory of computation - key ideas of computation. It's interesting how a mathematical idea about computation grew into physical computers.
    • Turing's and Gödel's theorems and how they relate to each other. Is there a way one can exclude the halting problem and build a machine that can determine if almost everything will halt?
    • How high level code actually executes on a machine.
  • Review of basic calculus
    • Intuitive understanding of derivatives and integrals.
    • Optimization and related-rates problems
    • Applications to physics
  • Probability & statistics - Ultimately all knowledge comes down to probabilities. Statistics are useful for interpreting studies and experiments and everything else.
    • Review fundamentals of probability
    • Bayesian probability and statistics
    • Pascal's triangle, the normal distribution, the central limit theorem
    • Applying statistics to real-world examples
    • Tools for stats (e.g. Google Sheets, R, Python)
    • Stats for machine learning
  • Using Mathematica for real world math problems


Besides calculus, it's interesting how little these topics are taught in schools. Many students don't know basic topics like fractions well and schools should focus on teaching them better. Other students can learn more advanced topics but it does not need to be limited to a narrow curriculum of trigonometry and geometry and specific parts of algebra. (See also my post from 2011.)

Carthago delenda est 

Monday, June 1, 2020

The Case for The Case against Education

In The Case against Education Bryan Caplan argues that education is primarily about signaling certain traits as opposed to learning useful skills, and that much of it is a waste for society. Here are some of my thoughts on the book:

  • Agree with much of the book. It shouldn't surprise most people to hear that schools teach a lot of useless stuff.
  • Caplan focuses on the US but it would be interesting to look at other countries. For example, Caplan dismisses online education as unlikely to become accepted by employers, but Open University is a remote learning option founded in 1969 that is a respectable option in many countries.
  • Caplan's big claim is that schools mainly signal certain traits (such as intelligence and conscientiousness), and he particularly emphasizes that schools signal "conformity" and that employers care strongly about it. I think this depends a lot on industry and the culture of the companies. For example, tech companies seem less concerned about conformity, though perhaps that's why many of them don't require college degrees. Other companies may still require degrees but they may just be conformist themselves without actually requiring conformists for the job. If it became more accepted to not go to college and to hire without degrees, how many companies would still insist on it?
  • Instead of just theorizing about what employers are looking for, it would be interesting to actually check. Big companies have specific criteria they look for when hiring applicants, and they also study the traits of their successful employees. For example, see this article on Google's hiring practices.
  • Not sure if Caplan gets this critique too often but in certain cases I think he gives school too much credit. For example, he says practical majors like engineering primarily involve learning useful skills. In my experience with Computer Science, much of the major consisted of theoretical math instead of practical topics. (That's why there's a practical-focused programming course called The Missing Semester of Your CS Education).
  • Caplan says some pretty extreme things, such as saying there should be zero government funding of education, or that it would be better if education were more expensive. As if the cost of education in America isn't high enough! There are better ways to beat credential inflation than making education more expensive, and ways that would be less unfair to lower-income people. For example, one could encourage companies to do more interviewing or hiring on a college-blind basis. (I think the hiring platform Triplebyte tried this to some extent.)
  • This may be an issue in general with books, but I'm not sure how much I remember from the middle of the book. I think people can just read the beginning and end of the book to get the gist of it.

(Review originally posted to Reddit.)

Sunday, May 31, 2020

Areas of Knowledge and Education

There are many topics I'm interested in learning about but they fall into five categories:
  • Theoretical topics (e.g. science) - It's interesting to learn about the world even if I probably won't make new scientific discoveries myself. This category is the main focus of schools, but usually without checking whether the students are interested in learning them.
  • Practical relevance to world (e.g. politics) - learning these topics isn't something I can practically use, but it can affect how I vote, and one individual can influence other people. Even one blog post can have an impact!
  • Practical skills - These can be a specific skill relevant to one's career (e.g. software development) or generally useful skills (e.g. typing). This could also include general-purpose abilities such as thinking rationally.
  • Interpersonal skills (e.g. public speaking) - these are also the subject of many self-help books
  • 'Intrapersonal' skills - this includes practical areas like time management and learning techniques, as well as skills for living a happier or more meaningful life, such as Stoicism or meditation. This could potentially include being a good person more generally.
Schools focus on theoretical topics but have very few classes dedicated to Interpersonal and Intrapersonal skills. Most people don't manage to pick up all these skills on their own so it's something a good education system could potentially help with.

Note that many subjects can contain topics in multiple categories. For example you can learn algorithms to practically apply them and also learn the mathematical theory behind them. While some theory may always be required, a good education system would let people choose to focus on more practical areas if they prefer.

In future posts I'll outline actual topics in more detail and what I'd like to learn more about them. While it will be a personal list, these outlines could also serve as potential topics that educational systems could offer.

Wednesday, May 27, 2020

Silver Bullets in Software Development

No Silver Bullet
In the 1986 essay No Silver Bullet, Fred Brooks argued that nothing would provide a tenfold improvement in software development within a decade:
But, as we look to the horizon of a decade hence, we see no silver bullet. There is no single development, in either technology or in management technique, that by itself promises even one order-of-magnitude improvement in productivity, in reliability, in simplicity. 
He divided software development into essential and accidental difficulties:
  • essential ones are required due to the complexity of the problem itself - the software needs to satisfy a "conceptual construct" in a precise manner
  • accidental ones (like determining the correct syntax) are incidental to the problem and can become simpler with better hardware and software techniques.
Accidental difficulties had been reduced to such an extent by 1986 that Brooks argued most software development dealt with essential complexities that could not be removed. While incremental progress would be possible, revolutionary "silver bullets" were impossible. Brooks reiterated his claims in 1995, but it's worth revisiting again. Have there been any silver bullets since then? How much of software development today deals with essential vs. accidental problems?

Silver Bullets. Photo Credit: Money Metals, Flickr

Software Development Today
On one hand there has been a tremendous amount of progress in speeding up development and in focusing on the essential problems:
  • Google and StackOverflow let one quickly find answers to questions
  • Open source libraries allow for broad code re-use
  • Cloud services like AWS make it easier to launch in production
  • Frameworks like Ruby on Rails provide default assumptions so the engineer can focus on defining the product
On the other hand it seems like much of engineering work today, particularly at large companies, deals with complex issues not connected directly to defining a product:
  • As products grow to encompass multiple teams, applications may be split into sub-applications for each team, but integrating them together adds additional layers of complexity
  • Integration tests involve so many systems that they're a constant point of failure, and often adding or updating a feature can require more time dealing with tests than with the actual code
  • As products grow larger and scale to more users, engineers spend more time on smaller optimizations
The move from desktop applications to the web also added new layers of complexity:
  • Application logic needs to be replicated on both the server and client side 
  • Every language and framework needs to be converted to JavaScript, an unusual choice for an "assembly" language
  • Since application data isn't generally stored on the client, latency becomes a constant issue
Depending on where one draws the dividing line, these problems can be considered either essential or accidental. While they do not deal with specifying the product itself, they arise from the size of the teams or from the technologies involved. Software development still deals with both the essential aspects of specifying a product and the many nuts and bolts of making it work correctly in the real world. 

A Silver Bullet 
There is a silver bullet that has completely revolutionized development - machine learning. Brooks had specifically dismissed AI as a silver bullet since back then AI meant "heuristic" or rule-based programming, where each product would still need all its details specified:
The techniques used for speech recognition seem to have little in common with those used for image recognition, and both are different from those used in expert systems. I have a hard time seeing how image recognition, for example, will make any appreciable difference in programming practice. The same problem is true of speech recognition. The hard thing about building software is deciding what one wants to say, not saying it. No facilitation of expression can give more than marginal gains.
Enter machine learning (ML), particularly deep learning with neural networks. Now the same overall techniques can be used for both speech recognition and image recognition. One no longer needs to decide precisely what "one wants to say", one just specifies a goal and given enough data, the neural networks will figure out the details. Systems that involved years of coding before can be replaced with a machine that learns on its own. For example, AlphaZero was able to learn chess by playing itself for a few hours and then it beat the best existing chess software. Programmers had spent decades improving chess software with hand-written heuristics, but machine learning outplayed them all. 

What's next
Despite the amazing progress of ML, most areas of software development do not have enough data to truly benefit from it, so they still have the same overall structure and process as years ago. What then are the next areas of progress?
  • Assisted programming - generating a program from a product definition has always been a dream (even mentioned by Brooks), and there's been recent progress. For the near future, humans are still needed to specify the nitty gritty details in code. But online resources like StackOverflow and GitHub (besides companies' internal codebases) contain enough data that search and ML algorithms will be able to assist in this process. A significant part of programming can be finding an example and modifying it for one's purpose, so even a better search alone will speed up overall development.
  • Much of programming consists of plumbing - connecting databases to an application, determining how to summarize the data, deciding how to display it in a UI. Since some of this is very standardized, companies can choose to use "low code" tools to build them, using products built for that purpose (e.g. from Salesforce) or even just advanced spreadsheets (Airtable). While visual programming does not offer the power and flexibility needed for building large applications, some products have much smaller scopes.
  • Some application plumbing will no longer be necessary for other reasons - ML will take over optimizing certain goals from humans, so a user interface will no longer be needed. For example, when an ad campaign runs on ML, far fewer knobs and dials need to be created for users. The system just takes in a budget and perhaps a goal to optimize for. In some cases, developers may still create tools for users to interface with the ML system, but in other cases the system will be a fully automated blackbox. Developing user interfaces might remain the same, but what interfaces are needed will change.
In short, software development will continue to make incremental progress in some areas and add accidental complexity in other areas, while some areas will be completely revolutionized by ML.





Sunday, May 24, 2020

Bryan Caplan on Who to Blame for Poverty

Bryan Caplan spoke today at an online SlateStarCodex / LessWrong meetup about his upcoming book Poverty: Who To Blame. Here's a brief summary of his talk and book, with my comments in italics. (Note: I think this summary is mostly accurate but I missed some of the Q&A and may have missed or misunderstood other parts.)

He first discussed consequentialism:
  • He recognized that his rationalist audience is mostly consequentialist / utilitarian, but he doesn't think anyone actually lives like that; in practice people care about what others deserve. If your friend needs help, you'll help them more if they didn't bring the bad circumstances on themselves; people care about "just deserts".
    • It seems this could be factored in to a consequentialist framework as a way to encourage better behavior from your friends.
  • In discussions of policy, consequentialists often just focus on the [short-term] outcomes and don't care about the value of letting people help themselves. While consequentialists could in theory care about this, in practice they don't.
    • Why are people being too consequentialist in practice now when they weren't in the last bullet? I think Caplan thinks people are too short-term consequentialist when it comes to public policy, but not in their own lives. If this short-term thinking is a real issue, maybe consequentialists should adopt virtue ethics for its better consequential outcomes.
He then discussed the topics from his upcoming book.
  • Corporations - People think that third-world countries suffer from the exploitation of international corporations, but in practice there isn't that much formal employment in poorer countries overall, and even less from the large companies. Instead, most employment is "incompetent self-employment" where people run their own business even though they don't have the skills to do so. (Interesting data, though maybe it could be called "sub-optimal self-employment".) In reality, large companies help a country since they pay higher wages than people would otherwise earn and provide better job security. Most of the progress in the last 50 years is from normal economic development, not philanthropy.
  • Housing regulation - people know how much regulation affects costs in the Bay Area, but there are similar issues throughout the world. Even India (which has high rates of homelessness) over-regulates housing. Iron-fisted governments prevent people from helping themselves. There should not be zoning laws that prevent tall buildings or multifamily homes. Some cities are getting so expensive that people are moving to cheaper, less productive cities.
    • Caplan thinks that neither cities nor countries should restrict who can move in. It seems reasonable to reduce many zoning restrictions in cities, though there can be collective risks from concentrating so many people in one area. For example, New York has not fared well in this pandemic; imagine if it were even more crowded.
    • Expensive cities aren't necessarily more productive; they often pay higher wages for the same output due to higher cost of living and market conditions. If remote work continues to grow, it's possible that cities (and even countries) may not continue to have as disparate wages in the future.
  • World poverty - As discussed in his book "Open Borders", the first world makes poverty worse in the third world by not letting people immigrate freely to the first world. When someone moves to the US they can make 10x more and be 10x as productive. If you focus on why a person is poor, you'll often find it's laws that caused it. If a father favored his son in a competition, people would think it's unfair, so why do they allow countries to favor their own citizens? Kuwait is better than the West in this matter: it lets people immigrate but just doesn't give them benefits.
    • A father can favor his son for his own company, so it seems reasonable that a country can favor their own citizens for the country's own resources. While a pure effective altruist would treat everyone in the world the same, people generally favor their own citizens over citizens of other countries (just like people generally care about "just deserts"). One could even argue that if countries weren't responsible for their own citizens, it would create the wrong national incentives since governments could offload their troubles elsewhere (though this would depend on what benefits the other countries provide to immigrants.)
    • Liberal democracies would feel uncomfortable not giving benefits like welfare to legal immigrants, but Caplan the libertarian has fewer qualms about it, and here it seems his position would help the poor more. But realistically Western countries are unlikely to create separate classes like Kuwait, and otherwise the math of paying for all the benefits can't work out.
He then discussed the most controversial part of his book, the responsibility the poor themselves have.
  • When you bring up blame, people say you're blaming the victim, you're blaming the poor. But much of his book blames government regulations, not the poor.
  • However, one reason for poverty is irresponsible behavior from the poor themselves: The poor have the highest percentage who are out of the workforce, i.e. not even looking for a job. They have more irresponsible sexual behavior and more dangerous alcohol and substance abuse.
  • What should you do about people causing their own poverty? At minimum you should prioritize other people who are not causing their own problems. But also you don't need to feel guilty for people who are messing up their own lives.
He didn't discuss that many in the audience might not believe in free will so they would still think that you should help people who made poor choices. Regardless of one's views on free will, certainly individuals have different natures and nurtures and some might struggle to succeed when other people find it easy. While it's fair to factor in how things like welfare can distort incentives to work, it seems extreme to not actively help the poor.



Friday, May 22, 2020

Coronavirus - Evidence and Restrictions

  • When learning topics of theoretical interest, one should learn from established science. There's enough established science that's interesting, why bother with the speculative stuff? However, when something is practically relevant but the facts are not known, you can't just wait for things to be proven. You need to use the best info and probabilities you have and act accordingly. This is something many people fail to realize.
    • For example, the WHO initially said there was "no clear evidence of human-to-human transmission of the novel coronavirus". Even if there had been no solid evidence, it would still make sense to suspect human-to-human transmission and take proper precautions instead of "not recommend[ing] any specific health measures for travellers" and being "against the application of any travel or trade restrictions on China" (link).
    • The WHO recommendations about masks were even more "radically conservative". They continued to insist for months that there was no evidence that wearing masks would help prevent the spread of the virus. But one can't wait for a double-blind study to test whether masks work. One needs to look at the empirical data available, such as the reduced spread of the virus in mask-wearing countries, or the best arguments available, such as the plausible reduction in the spread of droplets when people wear masks.
  • If people had acted earlier in those cases, many lives could have been saved. But it doesn't mean we should now go to the opposite extreme and recommend everyone remain in total isolation for months.
    • There is reasonable evidence that the virus primarily spreads from being indoors with someone for a while or from things like shouting and singing (besides coughing and sneezing of course). People who are careful could still meet outside in certain cases.
    • Many people who live alone are both very unlikely to have the virus and very unlikely to spread it to the elderly or other high-risk people. Such individuals shouldn't feel like they're in solitary confinement but should be able to meet with specific individuals in a careful manner. 
    • Governments should not just add every restriction possible and think this will keep people safe. There's a Talmudic statement "כל המוסיף גורע" - "whoever adds [restrictions], detracts" since people will treat all restrictions in the same manner and not be careful even for the important ones. Government policies need to focus on strongly enforcing important restrictions while allowing other low-risk activity to resume.

Monday, May 18, 2020

Science - New or True?

  • People often quote news articles about recent developments in science as if a new Truth was discovered. Health studies are a particularly popular topic.
    Studies show ad lib
  • However, most studies are false, and most news articles (particularly headlines) misquote or exaggerate them, and most people misquote or exaggerate the news article or headline. The probability the person is saying something true is low, say 40% * 40% * 40% = 6% (± 5%). 
  • If one wanted to learn about the world, it would be better to learn more established science that is very likely to be True.
  • While books and courses are a good way to learn knowledge, people are often interested in short tidbits instead. For example, see all the blogs and magazines that just publish the same thing about cleaning your house or being productive.
  • It's less common that media or people discuss established facts in science. But people should feel free to publish, share and discuss interesting things they've found out about nature. While it may not be new it's more likely to be true.
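The 6% figure above is just the product of the three rough guesses. A minimal sketch of the arithmetic, assuming each step is about 40% reliable and that the three steps are independent (both are back-of-the-envelope assumptions, not measurements):

```python
import math

# Rough guesses for each link in the chain (assumed, not measured).
p_study_true = 0.40       # the study's finding is actually true
p_news_accurate = 0.40    # the article reports the study faithfully
p_person_accurate = 0.40  # the person relays the article faithfully

# Assuming independence, the chance the final claim is true is the product.
p_claim_true = math.prod([p_study_true, p_news_accurate, p_person_accurate])

print(f"{p_claim_true:.1%}")  # prints 6.4%
```

Even with a more generous 70% at each step, the product is only about a third, so the conclusion doesn't depend much on the exact guesses.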

Sunday, May 17, 2020

Book Review: Meditations

  • Meditations by Marcus Aurelius (Penguin Edition) is the personal diary of Marcus Aurelius, Roman emperor from 161 to 180 CE. It's interesting how an emperor from over 18 centuries ago could still be considered relevant today. He was a follower of the Stoic philosophy of life and much of the book consists of his exhortations to himself.
  • I wouldn't recommend this book as a practical guide to Stoicism however. While the Penguin translation is OK, translations can sound stilted especially when the original text is from 1800 years ago. Some of the book's examples can be hard to relate to today, and other times there are no examples at all. Since the book was just written as notes to himself, it's often disorganized and repetitive.
  • To get the gist of the work, you can just read excerpts from it, such as Chapter 2 or 9. If you don't care about using a more modern translation or footnotes, you can get free translations online (such as the MIT version). To get a modern take on Stoicism, I recommend A Guide to the Good Life: The Ancient Art of Stoic Joy.
  • Themes
    • Marcus had some doubts about his religious beliefs but argues that his principles are true either way. For example, he often mentions the question of whether nature is unified/intelligent (as the Stoics argued), or random atoms (as per the Epicureans).
    • The key principle of Stoicism is that external circumstances cannot determine how you feel, your own mind is in charge:
      > Today I escaped from all bothering circumstances - or rather I threw them out. They were nothing external, but inside me, my own judgements.
    • Another related principle is that nature is good and there's no reason to be upset about what happens, that's just the way things are. Marcus mentions this often about death, and the final chapter focuses on this topic.
    • He often mentions how you shouldn't get too upset about things since in the grand scheme does it really matter? Zoom out and see how small everything is. Also, everyone will be dead soon anyways, and forgotten. But don't worry about that since that's just the course of nature.
      In man's life his time is a mere instant, his existence a flux, his perception fogged, his whole bodily composition rotting, his mind a whirligig, his fortune unpredictable, his fame unclear. To put it shortly: all things of the body stream away like a river, all things of the mind are dreams and delusion; life is warfare, and a visit in a strange land; the only lasting fame is oblivion.

      What then can escort us on our way? One thing, and one thing only: philosophy. This consists in keeping the divinity within us inviolate and free from harm, master of pleasure and pain, doing nothing without aim, truth, or integrity, and independent of others' action or failure to act. Further, accepting all that happens and is allotted to it as coming from that source which is its own origin: and at all times awaiting death with the glad confidence that it is nothing more than the dissolution of the elements of which every living creature is composed. Now if there is nothing fearful for the elements themselves in their constant changing of each into another, why should one look anxiously in prospect at the change and dissolution of them all? This is in accordance with nature: and nothing harmful is in accordance with nature. (End of Ch. 2)
       (Review also posted on Goodreads)

Thursday, May 14, 2020

Google picking

  • P-hacking is fishing around in data until you find a "significant" p-value so you can find an exciting claim, publish your paper and get tenure
  • Let's define Google-picking as trying out different Google searches until you find a result that says what you want it to say
  • To deal with issues like P-hacking, some institutions now require publication of all experiments and analysis performed or pre-registration of the proposed studies
  • If someone cites an obscure internet result to support their claim, they could be suspected of Google picking and should be required to "publish" what searches they performed
  • In all cases, the claim needs to be evaluated on its own merit (and one's own searches) regardless of how the data backing it was discovered
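To see why fishing around works so well, here is a rough sketch of the multiple-comparisons arithmetic behind P-hacking. It assumes every tested effect is truly absent, the tests are independent, and the significance threshold is the conventional 0.05. The function names are made up for this example. Under those assumptions each p-value is uniformly distributed between 0 and 1, so the simulation simply draws uniform random numbers.

```python
import random


def prob_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate(n_tests: int, alpha: float = 0.05, trials: int = 10_000, seed: int = 0) -> float:
    """Estimate the same chance by drawing uniform p-values for each 'study'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One "study": try n_tests analyses and keep it if any looks significant.
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / trials


analytic = prob_any_false_positive(20)
simulated = simulate(20)
print(f"analytic:  {analytic:.3f}")  # prints analytic:  0.642
print(f"simulated: {simulated:.3f}")
```

With 20 tries, a "significant" result turns up about 64% of the time even when nothing real is there. Google-picking works the same way: try enough searches and some result will agree with you.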

Wednesday, May 13, 2020

Studies on Studies on Slack

  • SlateStarCodex published a new essay, Studies on Slack. Here's a summary with some examples highlighted:
    • He discusses how competitive pressure can make things improve, but if there's some "slack" from this pressure, an organism or organization can pursue longer-term goals.
    • Starts by discussing evolution, but then gives examples in Capitalism, history, Civilization (the game), civilization (itself), the spread of ideas, etc. 
    • Gives example of Italy vs. Switzerland to suggest that maybe a little warfare can help with ideas.
    • Gives example of Sears where apparently the CEO thought there should be more internal competition, but it didn't work out.

  • I don't think the Italy vs. Switzerland example demonstrates much
    • it's not a large sample, there were many other differences between them, and the Swiss produced stuff too
    • It would always have been better and led to more progress if countries didn't kill each other and looked for other ways to compete, like the modern Western world does (or like Isaiah envisioned)
  • Trade is great but it also has many inefficiencies since it requires negotiations to get the right price and diligence to evaluate the service done, so adding too much intra-company competition sounds like a bad idea.
    • A company succeeds in part because people work together for a certain goal without trying to cut corners at every opportunity.
    • If everything is entirely based on getting a good performance evaluation, there will be too many attempts to game it. 
    • On the other hand, if there are no evaluations and the company just relies on selfless dedication, free-riders will bring the company down in the long term.
    • An economy overall requires even more "evaluation" (i.e. negotiated prices) since it's an even larger group where people feel even less commitment to the collective and where there's less implicit evaluation of one's performance. This is why communism failed, but a small startup can succeed.

Tuesday, May 12, 2020

Write something each day

  • I should try to post something short and quick each day
  • It can be a random thought or something interesting I've read
  • To make it easy, I'll just aim for 3-4 bullet points in each post and 3-4 posts a week