Explainable AI
goal
Is the detection of goal misgeneralisation the sole goal of XAI?
Others would argue that there are further useful goals, including:
- lie detection
- capability enhancements
- increased compute efficiency
- debugging training
- tripwires