
For versus Foreach


Whenever possible, use for instead of foreach.


This is what I was told shortly after learning about loops in C#, and I blindly believed it. Like many others, I had the idea that lower-level code means faster code.

But is it true? I decided to find out. To ensure an accurate and fair comparison of the two loops, the order in which they are tested is alternated: for runs first in half of the runs and foreach runs first in the other half. Each loop is timed 100 times in total, and the high-resolution Stopwatch class of the .NET Framework is used for timing.
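How fine-grained the Stopwatch timings are depends on the machine. As a minimal sketch (not part of the original test), the snippet below prints what the timer reports for itself, using only the `Stopwatch.IsHighResolution` and `Stopwatch.Frequency` members. The variable name `nanosecondsPerTick` is just an illustrative choice.

```csharp
using System;
using System.Diagnostics;

// Report which kind of timer backs Stopwatch on this machine.
Console.WriteLine($"High-resolution timer: {Stopwatch.IsHighResolution}");
Console.WriteLine($"Ticks per second:      {Stopwatch.Frequency}");

// Duration of a single Stopwatch tick, in nanoseconds.
double nanosecondsPerTick = 1_000_000_000.0 / Stopwatch.Frequency;
Console.WriteLine($"Nanoseconds per tick:  {nanosecondsPerTick:F1}");
```

If `IsHighResolution` is false on your system, small timings such as the 1 000-item runs should be read with more caution.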

The following is the test code.

```csharp
using System;
using System.Diagnostics;

long forTime = 0;
long foreachTime = 0;
int dataSize = 100000000;
int[] data = new int[dataSize];
int a;

Stopwatch sw = new Stopwatch();
var random = new Random();

// Fill the array with random values.
for (int i = 0; i < dataSize; i++)
{
  data[i] = random.Next(int.MaxValue);
}

// First 50 runs: for is timed before foreach.
for (int r = 0; r < 50; r++)
{
  // Restart (rather than Start) discards the ticks left over
  // from the foreach timing of the previous run.
  sw.Restart();

  for (int i = 0; i < data.Length; i++)
  {
    // access variable here
    a = data[i];
  }

  sw.Stop();
  forTime += sw.ElapsedTicks;
  sw.Restart();

  foreach (var num in data)
  {
    // access variable here
    a = num;
  }

  sw.Stop();
  foreachTime += sw.ElapsedTicks;
}

// Next 50 runs: foreach is timed before for.
for (int r = 0; r < 50; r++)
{
  // Restart discards the ticks left over from the previous run.
  sw.Restart();

  foreach (var num in data)
  {
    // access variable here
    a = num;
  }

  sw.Stop();
  foreachTime += sw.ElapsedTicks;
  sw.Restart();

  for (int i = 0; i < data.Length; i++)
  {
    // access variable here
    a = data[i];
  }

  sw.Stop();
  forTime += sw.ElapsedTicks;
}
```

The line below the “access variable here” comment is repeated once for each time we want to access the variable in an iteration. I changed the number of accesses between tests, but it was always the same for both loops within a single run of the code. The `dataSize` variable is the size of the array that we iterate through. The times reported below are in milliseconds, rounded to 4 decimal places because any more precision than that is insignificant. Each time is the average run time of the loop.
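To make the setup concrete, here is a minimal sketch of what the loop bodies look like when "times accessed" is 3, together with one way of turning accumulated ticks into an average in milliseconds. The helper `AverageMilliseconds`, the smaller array size, and the variable names are assumptions made for illustration. The original post does not show how the conversion was done.

```csharp
using System;
using System.Diagnostics;

const int runs = 100;                 // same number of runs as the test above
int[] data = new int[1_000_000];      // smaller array, chosen only for the sketch
int a = 0;
long forTicks = 0;
long foreachTicks = 0;
var sw = new Stopwatch();

for (int r = 0; r < runs; r++)
{
    sw.Restart();
    for (int i = 0; i < data.Length; i++)
    {
        // "Times accessed" = 3: the access line appears three times.
        a = data[i];
        a = data[i];
        a = data[i];
    }
    sw.Stop();
    forTicks += sw.ElapsedTicks;

    sw.Restart();
    foreach (var num in data)
    {
        // Same number of accesses in the foreach body.
        a = num;
        a = num;
        a = num;
    }
    sw.Stop();
    foreachTicks += sw.ElapsedTicks;
}

double forMs = AverageMilliseconds(forTicks, runs);
double foreachMs = AverageMilliseconds(foreachTicks, runs);
Console.WriteLine($"for:     {forMs} ms");
Console.WriteLine($"foreach: {foreachMs} ms (last value read: {a})");

// Hypothetical helper: ticks -> seconds via Stopwatch.Frequency,
// then milliseconds, averaged over the runs and rounded to 4 decimals.
static double AverageMilliseconds(long totalTicks, int runCount) =>
    Math.Round(totalTicks * 1000.0 / Stopwatch.Frequency / runCount, 4);
```

Unlike the full test, this sketch does not alternate which loop is timed first, so treat its numbers as a rough indication only.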

Here are the results:

| Data size | Times accessed | for time (ms) | foreach time (ms) |
|---|---|---|---|
| 1 000 | 1 | 0.0104 | 0.0112 |
| 10 000 | 1 | 0.1472 | 0.1530 |
| 100 000 | 1 | 1.2299 | 1.2885 |
| 1 000 000 | 1 | 10.3899 | 11.1161 |
| 10 000 000 | 1 | 106.8881 | 114.6377 |
| 100 000 000 | 1 | 1063.3170 | 1137.2804 |
| 1 000 | 2 | 0.0208 | 0.0207 |
| 10 000 | 2 | 0.1313 | 0.1312 |
| 100 000 | 2 | 1.3534 | 1.3296 |
| 1 000 000 | 2 | 13.3573 | 13.3566 |
| 10 000 000 | 2 | 134.1247 | 134.0124 |
| 100 000 000 | 2 | 1361.4287 | 1355.4975 |
| 1 000 | 3 | 0.0267 | 0.0240 |
| 10 000 | 3 | 0.1601 | 0.1539 |
| 100 000 | 3 | 1.8167 | 1.6782 |
| 1 000 000 | 3 | 16.3982 | 15.8385 |
| 10 000 000 | 3 | 162.6916 | 157.1088 |
| 100 000 000 | 3 | 1631.8420 | 1576.4689 |

**First things first**, performance should only be a consideration if you are dealing with about one million or more items. Any less than that and the difference between the two loops is insignificant.

Now for the more interesting observation. If the variable is accessed only once per iteration, the _for_ loop is faster. However, if you access that variable more than once, the _foreach_ loop starts to pull ahead. As the number of accesses increases, the amount by which the _foreach_ loop is faster than the _for_ loop becomes greater.

Now, next time someone tells you to use _for_ instead of _foreach_ for your list of 10 items, ask them why and see what explanation they dream up.

What we can learn from this is that people can’t always be trusted, and that testing for yourself is always a good idea. Even if you end up proving yourself wrong, from that point on you can confidently code the better way.

Don’t take my word for it though. Go test the loops for yourself. See what results you get and research why. **It’s never a bad time to learn something new.**

So I’ll end with the famous aphorism from David Wheeler (which motivated me to do this test) that I heard repeated by C++ creator, Bjarne Stroustrup, in a talk about how low-level code is rarely faster than high-level code:

> “All problems in computer science can be solved by another level of indirection, except for the problem of too many layers of indirection.”