<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Limits of for-loop parallelism, how parallel they really execute?</title>
	<atom:link href="http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/feed/" rel="self" type="application/rss+xml" />
	<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/</link>
	<description>LabVIEW and visual programming blog</description>
	<lastBuildDate>Wed, 07 Jul 2010 22:34:08 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: Parallel For loop &#124; ByteLABS</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6990</link>
		<dc:creator>Parallel For loop &#124; ByteLABS</dc:creator>
		<pubDate>Sat, 09 Jan 2010 14:12:17 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6990</guid>
		<description>[...] article on expression flow on this subject: http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute.... This entry was posted by admin on January 9, 2010 at 3:11 pm, and is filed under LabVIEW. [...]</description>
		<content:encoded><![CDATA[<p>[...] article on expression flow on this subject: http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute&#8230;. This entry was posted by admin on January 9, 2010 at 3:11 pm, and is filed under LabVIEW. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: AristosQueue</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6948</link>
		<dc:creator>AristosQueue</dc:creator>
		<pubDate>Thu, 17 Sep 2009 02:41:27 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6948</guid>
		<description>Tomi, in answer to your question, yes, Mary is part of LV R&amp;D, and she is a primary developer of the parallel for loop, and one of our resident experts in parallel architectures.</description>
		<content:encoded><![CDATA[<p>Tomi, in answer to your question, yes, Mary is part of LV R&amp;D, and she is a primary developer of the parallel for loop, and one of our resident experts in parallel architectures.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MaryFletcher</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6947</link>
		<dc:creator>MaryFletcher</dc:creator>
		<pubDate>Wed, 16 Sep 2009 20:34:16 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6947</guid>
		<description>Yep, the multiplication rule applies to code in SubVIs called from parallel for loops, even without the service pack. Thanks for the recursive parallelism test case idea.</description>
		<content:encoded><![CDATA[<p>Yep, the multiplication rule applies to code in SubVIs called from parallel for loops, even without the service pack. Thanks for the recursive parallelism test case idea.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tomi Maila</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6946</link>
		<dc:creator>Tomi Maila</dc:creator>
		<pubDate>Wed, 16 Sep 2009 19:29:11 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6946</guid>
		<description>I assume the P*P&#039; worker multiplication rule of LV 2009 SP 1 also applies to subVI calls from within a parallel loop if the subVI contains another parallel loop. And in the case of a recursive VI call, we could get a whole army of workers :) Indeed, this gives me a nice idea. With a recursive VI call, we can specify the number of workers at runtime. We simply call the VI itself recursively from within the loop until the required number of workers has been reached. Mary, maybe you should add this as a test case for LabVIEW 2009 SP 1: &quot;Generating an arbitrary number of workers with a recursive SubVI call&quot;</description>
		<content:encoded><![CDATA[<p>I assume the P*P&#8217; worker multiplication rule of LV 2009 SP 1 also applies to subVI calls from within a parallel loop if the subVI contains another parallel loop. And in the case of a recursive VI call, we could get a whole army of workers <img src='http://expressionflow.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Indeed, this gives me a nice idea. With a recursive VI call, we can specify the number of workers at runtime. We simply call the VI itself recursively from within the loop until the required number of workers has been reached. Mary, maybe you should add this as a test case for LabVIEW 2009 SP 1: &#8220;Generating an arbitrary number of workers with a recursive SubVI call&#8221;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MaryFletcher</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6945</link>
		<dc:creator>MaryFletcher</dc:creator>
		<pubDate>Wed, 16 Sep 2009 18:34:03 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6945</guid>
		<description>In 2009 SP1, if you put a parallel loop with P workers inside a parallel loop with P&#039; workers, you would get P*P&#039; parallel instances solving the problem. (Without the service pack, only the outer loop will execute in parallel.)</description>
		<content:encoded><![CDATA[<p>In 2009 SP1, if you put a parallel loop with P workers inside a parallel loop with P&#8217; workers, you would get P*P&#8217; parallel instances solving the problem. (Without the service pack, only the outer loop will execute in parallel.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tomi Maila</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6944</link>
		<dc:creator>Tomi Maila</dc:creator>
		<pubDate>Wed, 16 Sep 2009 18:08:15 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6944</guid>
		<description>Yes, I am asking what happens when the loop contains code that would also need to run in parallel. Say I had N workers working on a loop with N iterations, and each iteration consisted of two enqueue-dequeue pairs instead of one as in my present example code: would we have a deadlock? Well, I tested it and no deadlock occurs, which confirms true parallelism.

What if I have a parallel loop inside another parallel loop? The outer loop has N1 workers and the inner loop has N2 workers. How many parallel instances are actually solving the problem of the inner loop? N1*N2?</description>
		<content:encoded><![CDATA[<p>Yes, I am asking what happens when the loop contains code that would also need to run in parallel. Say I had N workers working on a loop with N iterations, and each iteration consisted of two enqueue-dequeue pairs instead of one as in my present example code: would we have a deadlock? Well, I tested it and no deadlock occurs, which confirms true parallelism.</p>
<p>What if I have a parallel loop inside another parallel loop? The outer loop has N1 workers and the inner loop has N2 workers. How many parallel instances are actually solving the problem of the inner loop? N1*N2?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MaryFletcher</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6943</link>
		<dc:creator>MaryFletcher</dc:creator>
		<pubDate>Wed, 16 Sep 2009 17:41:30 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6943</guid>
		<description>Sorry for the terminology confusion. Loop &quot;instances&quot; and &quot;workers&quot; are the same thing.

What you restated is correct, except that when you don&#039;t wire anything to [P], we actually try to use as many workers as there are logical processors (cores) on your machine.

Are you asking what happens when the loop contains code that could also run in parallel? Parallelism in the code of the loop body isn&#039;t restricted by the number of workers you are using for the loop. Your computer may get overwhelmed by too much parallelism, though.</description>
		<content:encoded><![CDATA[<p>Sorry for the terminology confusion. Loop &#8220;instances&#8221; and &#8220;workers&#8221; are the same thing.</p>
<p>What you restated is correct, except that when you don&#8217;t wire anything to [P], we actually try to use as many workers as there are logical processors (cores) on your machine.</p>
<p>Are you asking what happens when the loop contains code that could also run in parallel? Parallelism in the code of the loop body isn&#8217;t restricted by the number of workers you are using for the loop. Your computer may get overwhelmed by too much parallelism, though.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tomi Maila</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6942</link>
		<dc:creator>Tomi Maila</dc:creator>
		<pubDate>Wed, 16 Sep 2009 17:23:11 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6942</guid>
		<description>Mary, I assume from your insider-sounding information that you work for NI R&amp;D. Thanks for clarifying how parallel for loops function behind the scenes. I tried to find this information in the LabVIEW help but couldn&#039;t, so I ended up testing the functionality.

If I understood correctly, LabVIEW generates the number of loop instances specified at development time. The loop iterations are then divided among these loop instances by the runtime scheduler. The scheduler uses either all the parallel loop instances or the number of instances corresponding to the value of the P terminal, whichever is smaller. If P is not connected, all loop instances are used. If the number of loop instances is larger than the number of iterations, all iterations are executed truly in parallel.

What happens if there are parallel items within a single iteration? Are these also executed truly in parallel, even if the number of iterations is equal to the number of loop instances?

I am also a little confused by the terminology. In the dialog window you specify something called loop instances, but with the P terminal you specify something called workers. Is there documentation that clarifies the difference between workers and loop instances?</description>
		<content:encoded><![CDATA[<p>Mary, I assume from your insider-sounding information that you work for NI R&#038;D. Thanks for clarifying how parallel for loops function behind the scenes. I tried to find this information in the LabVIEW help but couldn&#8217;t, so I ended up testing the functionality.</p>
<p>If I understood correctly, LabVIEW generates the number of loop instances specified at development time. The loop iterations are then divided among these loop instances by the runtime scheduler. The scheduler uses either all the parallel loop instances or the number of instances corresponding to the value of the P terminal, whichever is smaller. If P is not connected, all loop instances are used. If the number of loop instances is larger than the number of iterations, all iterations are executed truly in parallel.</p>
<p>What happens if there are parallel items within a single iteration? Are these also executed truly in parallel, even if the number of iterations is equal to the number of loop instances?</p>
<p>I am also a little confused by the terminology. In the dialog window you specify something called loop instances, but with the P terminal you specify something called workers. Is there documentation that clarifies the difference between workers and loop instances?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MaryFletcher</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6941</link>
		<dc:creator>MaryFletcher</dc:creator>
		<pubDate>Wed, 16 Sep 2009 17:08:30 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6941</guid>
		<description>By the way, if you want to easily find for loops that can be made parallel, you can use the &quot;parallel for loop detector&quot; in Tools&gt;&gt;Profile&gt;&gt;Find Parallelizable Loops. LabVIEW doesn&#039;t automatically parallelize all of your for loops for you, since there can be a slight performance penalty on small loops.</description>
		<content:encoded><![CDATA[<p>By the way, if you want to easily find for loops that can be made parallel, you can use the &#8220;parallel for loop detector&#8221; in Tools&gt;&gt;Profile&gt;&gt;Find Parallelizable Loops. LabVIEW doesn&#8217;t automatically parallelize all of your for loops for you, since there can be a slight performance penalty on small loops.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: MaryFletcher</title>
		<link>http://expressionflow.com/2009/09/15/limits-of-for-loop-parallelism-how-parallel-they-really-execute/comment-page-1/#comment-6940</link>
		<dc:creator>MaryFletcher</dc:creator>
		<pubDate>Wed, 16 Sep 2009 16:52:13 +0000</pubDate>
		<guid isPermaLink="false">http://expressionflow.com/?p=253#comment-6940</guid>
		<description>Interesting example of the potential dangers of using these types of objects in parallel. In this piece of code, a deadlock can occur when [N] is greater than the number of generated instances in the configuration dialog box (10 in your diagram). This causes some of the worker loops to execute more than one iteration sequentially. When [N] is 20, a worker from the top loop will operate on queues at indices 9 and 10 in that order, while a worker from the bottom loop will operate on queues 10 and 9 in that order. The same deadlock happens if you disable parallelism on both loops.

The number you enter in the configuration dialog is the maximum number of loop instances, and the number you wire to [P] is the number of those that you want to use. The dialog box number is a cap.

LabVIEW warns you about using queues, local variables, etc. in parallel for loops, assuming you have warnings enabled, but it doesn&#039;t stop you. You can do some nifty things with these types of objects (see the example Parallel For Loop Iteration Order.vi), so we don&#039;t forbid it.</description>
		<content:encoded><![CDATA[<p>Interesting example of the potential dangers of using these types of objects in parallel. In this piece of code, a deadlock can occur when [N] is greater than the number of generated instances in the configuration dialog box (10 in your diagram). This causes some of the worker loops to execute more than one iteration sequentially. When [N] is 20, a worker from the top loop will operate on queues at indices 9 and 10 in that order, while a worker from the bottom loop will operate on queues 10 and 9 in that order. The same deadlock happens if you disable parallelism on both loops.</p>
<p>The number you enter in the configuration dialog is the maximum number of loop instances, and the number you wire to [P] is the number of those that you want to use. The dialog box number is a cap.</p>
<p>LabVIEW warns you about using queues, local variables, etc. in parallel for loops, assuming you have warnings enabled, but it doesn&#8217;t stop you. You can do some nifty things with these types of objects (see the example Parallel For Loop Iteration Order.vi), so we don&#8217;t forbid it.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
