Today I attended the LabVIEW Developer Education Day in Helsinki (Espoo), Finland. The NI application engineer introduced the new parallel for-loop structure released in LabVIEW 2009. The idea is that for-loop iterations can run in parallel when they do not depend on one another. The concept is nice, and I have been waiting for NI to introduce something like it for some years. I was positively surprised to notice that LabVIEW 2009 ships with this new feature.
The parallel for-loop was not exactly what I was hoping for, though. I was hoping that the LabVIEW compiler would automatically parallelize parallelizable loops; after all, it is theoretically a task that the compiler or the runtime environment could do. However, it may be that implementing such a compilation technique with the current LabVIEW runtime scheduler would have been too difficult. In the implementation introduced in LabVIEW 2009, the programmer needs to configure the for-loop parallelism by defining the number of instances that work in parallel on the loop iterations. The number of instances can also be defined at runtime, or at least that is how I understood it.
So I decided to give parallel for-loops a try and test their limits. Do they really function the same way as if you placed the same code in parallel on the block diagram, or are there some shortcomings you should be aware of? The first test I decided to wire makes the loop iterations depend on one another through shared queues. I made two parallel for-loops. The first loop inserts elements into a first set of queues and then waits for the elements to appear in a second set of queues. The second loop gets the elements from the first set of queues and then inserts the same elements into the second set of queues. I set both loops to run in parallel and set the number of workers equal to the number of iterations. If the loop iterations really all run in parallel, the same way as parallel code on the block diagram runs, the application will not hang in a deadlock. However, if the parallelism is not complete, the application will hang.
So what is the result of our test? Well, this small test application hangs if you set the number of iterations high. The hanging threshold seems to depend on the number of parallel loop instances set at development time. The number of workers defined at runtime does not alone determine the parallelism. The result indicates that there is a difference between copying code to multiple parallel instances on the block diagram and relying on for-loop parallelism. If the code in your loop depends on shared data such as queues, data-value instances or notifiers, be aware of the possibility of a deadlock. Also, when you use somebody else's code in your parallel for-loop, think carefully about whether a deadlock is possible.
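LabVIEW code is graphical, so the test VI itself cannot be shown as text. As a loose analogy only, here is a small Python sketch of the same kind of failure: loop iterations that block on each other through queues while fewer workers than iterations are available. It does not reproduce my test VI or the LabVIEW scheduler; the thread pool, the function name `run_loop` and the timeout (used so that the sketch reports the stall instead of hanging forever) are all assumptions made for illustration.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def run_loop(n_iterations, n_workers, timeout=0.2):
    """Run n_iterations mutually dependent iterations on n_workers threads.

    Hypothetical helper, not a LabVIEW API. Returns True if every
    iteration completed, False if any iteration gave up waiting.
    """
    # One inbox queue per iteration, shared between all iterations.
    inbox = [queue.Queue() for _ in range(n_iterations)]

    def iteration(i):
        # Send one element to every other iteration...
        for j in range(n_iterations):
            if j != i:
                inbox[j].put(i)
        # ...then wait for one element from every other iteration.
        try:
            for _ in range(n_iterations - 1):
                inbox[i].get(timeout=timeout)
        except queue.Empty:
            # A partner iteration never started: in LabVIEW, with no
            # timeout wired, this would be a hang.
            return False
        return True

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(iteration, range(n_iterations)))
    return all(results)


if __name__ == "__main__":
    print("4 iterations, 4 workers:", run_loop(4, 4))
    print("4 iterations, 2 workers:", run_loop(4, 2))
```

With as many workers as iterations, every iteration finds its partners running and the loop completes. With fewer workers, the first iterations occupy all the workers while waiting for iterations that cannot start, which is the situation to watch out for.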
EDIT Sep 17, 2009 Mary Fletcher from NI R&D explained the implementation of the parallel for-loops in more detail in the comments of this post. The number of loop instances specified in the loop configuration dialog is the maximum number of workers that can work in parallel on executing the loop iterations. The actual number of workers is determined at runtime: it is the value of the P terminal if that is smaller than the maximum number specified in the configuration dialog. If P is greater than the maximum number, then the maximum number of workers is used instead. If P is not connected, LabVIEW uses as many parallel workers as there are logical processors in the machine, however never exceeding the maximum number of workers specified in the configuration dialog. If there is a parallel loop within another parallel loop, only the outer parallel loop will be parallelized. This will change in LabVIEW 2009 SP 1, where LabVIEW will parallelize both loops, resulting in P*P’ workers for the inner loop. This limitation on nested parallel loops does not apply to subVIs called within a parallel loop, even when those subVIs contain parallel code themselves. If the numbers of workers specified at configuration time and at runtime are both equal to or greater than the number of iterations, all the loop iterations will then execute truly in parallel and you can safely use design patterns such as the producer-consumer pattern between loop iterations. Thank you Mary for this valuable information.
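To make the worker-count rule easier to check against your own configuration, here is how I would write it down in Python. This is only my reading of Mary's explanation, not NI code: the function and parameter names are hypothetical, `os.cpu_count()` stands in for the number of logical processors LabVIEW would detect, and the behaviour for a zero or negative P is not covered because it was not described.

```python
import os


def effective_workers(configured_max, p=None, logical_processors=None):
    """Number of workers a parallel for-loop would use (assumed rule).

    configured_max     -- maximum set in the loop configuration dialog
    p                  -- value wired to the P terminal, None if unwired
    logical_processors -- override for the detected processor count
    """
    if logical_processors is None:
        logical_processors = os.cpu_count() or 1
    # Unwired P: fall back to the number of logical processors.
    requested = logical_processors if p is None else p
    # The configured maximum always caps the result.
    return min(requested, configured_max)


def runs_fully_parallel(n_iterations, configured_max, p=None,
                        logical_processors=None):
    """True if every iteration can get a worker of its own."""
    workers = effective_workers(configured_max, p, logical_processors)
    return workers >= n_iterations


if __name__ == "__main__":
    print(effective_workers(configured_max=8, p=4))
    print(effective_workers(configured_max=4, p=16))
    print(runs_fully_parallel(n_iterations=10, configured_max=8, p=10))
```

The last call is the case that bit my test: P matches the number of iterations, but the configured maximum is lower, so not every iteration gets a worker of its own.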