Questions for ‘How to design artificial intelligence that acts nice — and only nice’


Could robots ever turn against humanity? That remains the stuff of science fiction. Yet even today's bots can cause harm in some ways. So researchers are working on ways to make them safer.


To accompany ‘How to design artificial intelligence that acts nice — and only nice’


Before Reading:

  1. Give one example of a movie, book or series involving robots or artificial intelligence (AI) gone wrong. Describe the plot in one sentence. Then describe one aspect of this fictional work that you consider far-fetched, or very unlikely to ever occur in the real world. Finally, considering recent advances in AI technology, describe some potential problem — perhaps from this movie/book/series — that you think could become a reality if we do not take preventive steps.
  2. Think about the way you’ve used AI technology lately. Write down one ability carried out by AI that improves your life. Then, write down one ability you think an AI of the future should not have, even if it’s technically possible to grant this ability.

During Reading:

  1. When scientists in this story talk about solving the “alignment problem,” what goal are they trying to achieve?
  2. To what extent does the Center for AI Safety consider advanced AI to be a serious potential threat to humans? Explain your answer.
  3. List the “three H’s.”
  4. Give one example of a large language model.
  5. How do large language models make use of probabilities to write text?
  6. How did OpenAI teach its large language model what types of text it should not generate?
  7. What was OpenAI’s Minecraft bot attempting to accomplish when it chopped down a house?
  8. Briefly describe the bowl task used to test the AI system called KnowNo.
  9. This story describes many examples of how scientists and professional researchers help improve AI safety. But how might non-scientists contribute to this goal? Give one example.

After Reading:

  1. Jeff Hawkins says the fear of AI spontaneously taking over the world is “far-fetched.” Briefly explain his reasoning. Hawkins points out that humans give AI their motivations. In other words, an AI model will want to accomplish whatever it’s told to accomplish. Assuming it’s true that AI will not create its own motivations, how might problems still arise? Describe one approach from this story that aims to address this potential problem.
  2. What does it mean to say that humans have “values”? Ayanna Howard says that training AI with human values may be possible. What problem does she anticipate, even if this goal is achieved? What does this tell us about the importance of diversity in science?