Towards Benchmarking Explainable Artificial Intelligence Methods