How much can we learn about AI with interpretability tools?
outline of answer:
- How much can we learn with current interpretability tools
- interpretability research is still very new, so there is a lot we can’t know
- some examples of successful interpretability
- is interpretability only post facto justification?
- How much can we learn in principle