Measuring Scaling Performance

An exploration of strong and weak scaling in parallel computing, including their definitions, ideal scenarios, and practical considerations.

This article was originally published at drstarry.github.io.

Introduction

Measuring scalability (scaling efficiency) is common in parallel computing and HPC. Two basic ways to measure it are Strong Scaling and Weak Scaling; we will look at each in turn.

Strong Scaling

Defined as

how the solution time varies with the number of processors for a fixed total problem size.

Ideally, we want the program to finish N times faster on N processors (or other parallel units). Speedup is defined as S = T_1 / T_N, where T_1 is the runtime on one processor and T_N the runtime on N processors. Linear speedup occurs when S = N. In practice, Amdahl's Law tells us the speedup does not approach N: the serial parts of the program get no speedup, so with a serial fraction f the speedup is bounded by S = 1 / (f + (1 - f) / N), which approaches 1 / f as N grows.
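As a quick illustration, here is a minimal Python sketch of Amdahl's Law; the 5% serial fraction is an assumed example value, not a measurement.

    def amdahl_speedup(f, n):
        """Predicted speedup on n processors when a fraction f of the work is serial."""
        return 1.0 / (f + (1.0 - f) / n)

    # Even with only 5% serial work, the speedup saturates far below n:
    for n in (2, 8, 64, 1024):
        print(n, round(amdahl_speedup(0.05, n), 2))
    # 2 -> 1.9, 8 -> 5.93, 64 -> 15.42, 1024 -> 19.64; the limit is 1/0.05 = 20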

Strong scaling is a natural fit for CPU-bound problems, where the goal is to decompose the computation sensibly without leaving too many CPU cycles stalled. In general, though, it is hard to maintain strong scaling with a large number of processes because, as mentioned above, the communication overhead hurts performance.
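Strong-scaling results are often reported as the efficiency E = T_1 / (N * T_N). Here is a minimal Python sketch, using hypothetical wall-clock timings purely for illustration:

    def strong_scaling_efficiency(t1, tn, n):
        """Strong-scaling efficiency: 1.0 means perfect linear speedup."""
        return t1 / (n * tn)

    # Hypothetical timings (seconds) for a fixed total problem size
    timings = {1: 100.0, 4: 27.0, 16: 8.5, 64: 3.1}
    for n, tn in timings.items():
        print(f"{n:4d} procs: efficiency = {strong_scaling_efficiency(timings[1], tn, n):.2f}")
    # Efficiency typically drops as N grows (here roughly 1.00, 0.93, 0.74, 0.50)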

Weak Scaling

Defined as

how the solution time varies with the number of processors for a fixed problem size per processor

The ideal situation is to solve an N-times-larger problem in the same amount of time. We cannot achieve this when the serial work grows along with the problem size, and the communication overhead among processors may grow correspondingly as well.
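For weak scaling, efficiency is usually reported as E = T_1 / T_N, since the work per processor is held constant. A minimal Python sketch with hypothetical timings:

    def weak_scaling_efficiency(t1, tn):
        """Weak-scaling efficiency: ideally T_N == T_1 when work per processor is fixed."""
        return t1 / tn

    # Hypothetical timings (seconds); total problem size grows with N
    timings = {1: 100.0, 4: 104.0, 16: 112.0, 64: 130.0}
    for n, tn in timings.items():
        print(f"{n:4d} procs: efficiency = {weak_scaling_efficiency(timings[1], tn):.2f}")
    # 1.00, 0.96, 0.89, 0.77 -- the gradual drop reflects growing serial and communication costs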

Weak scaling tends to work better

Getting N times the work done on N cores is still more feasible when

  • Small problem sizes keep every node of a local cluster busy
  • Your code has the scalability properties mentioned earlier
  • Easiest scenario: all cases are nearly totally independent of each other

Gustafson's Law redefines efficiency from this perspective: instead of fixing the total problem size, it assumes each computing unit stays fully utilized as the problem grows. Research focus then shifts to how much larger a problem can be computed in the same amount of time; the scaled speedup is S = s + p * N, where s and p are the serial and parallel fractions of the runtime on the parallel system (s + p = 1).
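A minimal Python sketch of the scaled speedup; the 5% serial fraction is again an assumed example value:

    def gustafson_speedup(s, n):
        """Scaled speedup under Gustafson's Law: the parallel work grows with n
        while the serial fraction s of the runtime stays fixed."""
        return s + (1.0 - s) * n

    # With 5% serial time, the scaled speedup keeps growing with n:
    for n in (2, 8, 64, 1024):
        print(n, round(gustafson_speedup(0.05, n), 2))
    # 2 -> 1.95, 8 -> 7.65, 64 -> 60.85, 1024 -> 972.85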

References

  1. Class Notes from Cornell
  2. Intro to Parallel Algorithms
  3. Scalability (Wikipedia)
  4. SHARCNET blog