Linear Complexity $\mathcal{H}^2$ Direct Solver for Fine-Grained Parallel Architectures