How much can we learn about AI with interpretability tools?
outline of answer:
-
How much can we learn with current interpretability tools?
-
interpretability research is still very new, so there is a lot we can’t yet know
-
some examples of successful interpretability
-
is interpretability only post facto justification?
-
-
How much can we learn in principle?