LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case