We propose a compositional approach to synthesize policies for networks of continuous-space stochastic control systems with unknown dynamics using model-free reinforcement learning (RL). The approach is based on implicitly abstracting each subsystem in the network with a finite Markov decision process with unknown transition probabilities, synthesizing a strategy for each... Show more