How much can we learn about AI with interpretability tools?

outline of answer:

  1. How much can we learn with current interpretability tools?

    1. interpretability research is still very new, so there is a lot we don’t yet know

    2. some examples of successful interpretability

    3. is interpretability only post facto justification?

  2. How much can we learn in principle?