In this book chapter, the authors discuss some important communication issues to obtain a highly scalable computing system. They consider the CGM (Coarse-Grained Multicomputer) model, a realistic computing model to obtain scalable parallel algorithms. The communication cost is modeled by the number of communication rounds and the objective is to design algorithms that require the minimum number of communication rounds. They discuss some important issues and make considerations of practical importance, based on our previous experience in the design and implementation of parallel algorithms. The first issue is the amount of data transmitted in a communication round. For a practical implementation to be successful they should attempt to minimize this amount, even when it is already within the limit allowed by the CGM model. The second issue concerns the trade-off between the number of communication rounds which the CGM attempts to minimize and the overall communication time taken in the communication rounds. Sometimes a larger number of communication rounds may actually reduce the total amount of data transmitted in the communications rounds. These two issues have guided us to present efficient parallel algorithms for the string similarity problem, used as an illustration.