Published

2023-10-29

Explainable AI

goal

Is the sole goal of XAI the detection of goal misgeneralisation?

Others would say that useful goals also include:
- lie detection
- capability enhancements
- increased compute efficiency
- debugging training
- tripwires