Differences
This shows you the differences between two versions of the page.
analytical_modeling_of_system_temperature:index [2009/05/07 18:47] anshulg |
analytical_modeling_of_system_temperature:index [2010/08/17 18:27] (current) |
||
---|---|---|---|
Line 749: | Line 749: | ||
===Linked-list Cache Miss Test=== | ===Linked-list Cache Miss Test=== | ||
+ | For a linked-list test, every cache miss depends on the previous cache miss. Thus, there is limited runahead. | ||
== cache miss with nops == | == cache miss with nops == | ||
We create a large linked-list and traverse it in a while loop, essentially chasing pointers. While a miss is being serviced, the machine executes nops (added in the assembly code before the end of each loop iteration). We increase the number of nops in the loop to decrease the cache miss rate (more nops per iteration means more time between requests to memory for data). Thus, the more nops executed per iteration, the lower the cache miss rate and the lower the peak temperature.\\ | We create a large linked-list and traverse it in a while loop, essentially chasing pointers. While a miss is being serviced, the machine executes nops (added in the assembly code before the end of each loop iteration). We increase the number of nops in the loop to decrease the cache miss rate (more nops per iteration means more time between requests to memory for data). Thus, the more nops executed per iteration, the lower the cache miss rate and the lower the peak temperature.\\ | ||
Line 900: | Line 900: | ||
804851c: 8d 61 fc lea -0x4(%ecx),%esp | 804851c: 8d 61 fc lea -0x4(%ecx),%esp | ||
804851f: c3 ret | 804851f: c3 ret | ||
+ | </code> | ||
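The pointer-chasing loop described above can be sketched in C roughly as follows. This is an illustrative reconstruction, not the original source: the names `struct node`, `build_ring`, `chase` and `free_ring` are hypothetical, the 64-byte node size is an assumption, the step count is bounded (the disassembled test loops forever), and the inline `nop` block assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node, padded to 64 bytes so that each node fills
 * one typical cache line (an assumption about the target CPU). */
struct node {
    char pad[64 - sizeof(struct node *)];
    struct node *next;
};

/* Free every node of a circular list. */
void free_ring(struct node *head)
{
    if (!head)
        return;
    struct node *p = head->next;
    while (p != head) {
        struct node *dead = p;
        p = p->next;
        free(dead);
    }
    free(head);
}

/* Build a circular list of n separately allocated nodes.
 * Returns NULL if any allocation fails. */
struct node *build_ring(size_t n)
{
    assert(n > 0);
    struct node *head = calloc(1, sizeof *head);
    if (!head)
        return NULL;
    head->next = head;
    struct node *tail = head;
    for (size_t i = 1; i < n; i++) {
        struct node *fresh = calloc(1, sizeof *fresh);
        if (!fresh) {
            free_ring(head);   /* ring is closed at every step */
            return NULL;
        }
        fresh->next = head;
        tail->next = fresh;
        tail = fresh;
    }
    return head;
}

/* Follow `steps` next-pointers. Each load depends on the previous
 * one, so misses cannot overlap. The nop block stands in for the
 * padding that spaces out memory requests. */
struct node *chase(struct node *p, size_t steps)
{
    for (size_t i = 0; i < steps; i++) {
        p = p->next;
        __asm__ volatile("nop\n\tnop\n\tnop\n\tnop");
    }
    return p;
}
```

To reproduce the experiment, one would build a ring much larger than the last-level cache, call `chase` with a very large step count, and vary the number of `nop` instructions between builds.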
+ | |||
+ | ===Linked-list Branch Misprediction Test=== | ||
+ | We wrap a simple branch condition around the linked-list test. Even when the branch is mispredicted and the processor goes ahead and fetches a node that misses in the cache, the fetched data may still be useful later on. Thus, a mispredicted branch is not totally useless.\\ | |
+ | |||
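+ | We compare the generated random number against RAND_MAX-1 and RAND_MAX/2. With RAND_MAX/2 the branch goes either way about half the time and is hard to predict; with RAND_MAX-1 it almost always goes the same way. This creates different branch misprediction rates.\\ | |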
+ | |||
+ | Results indicate that the higher the branch misprediction rate, the higher the CPU peak temperature. We believe this is due to the work done in flushing the pipeline. | |
+ | {{:analytical_modeling_of_system_temperature:mispred.png?1000}} | ||
+ | <code> | ||
+ | 08048424 <main>: | ||
+ | 8048424: 8d 4c 24 04 lea 0x4(%esp),%ecx | ||
+ | 8048428: 83 e4 f0 and $0xfffffff0,%esp | ||
+ | 804842b: ff 71 fc pushl -0x4(%ecx) | ||
+ | 804842e: 55 push %ebp | ||
+ | 804842f: 89 e5 mov %esp,%ebp | ||
+ | 8048431: 51 push %ecx | ||
+ | 8048432: 83 ec 44 sub $0x44,%esp | ||
+ | 8048435: c7 44 24 04 04 00 00 movl $0x4,0x4(%esp) | ||
+ | 804843c: 00 | ||
+ | 804843d: c7 04 24 a5 85 04 08 movl $0x80485a5,(%esp) | ||
+ | 8048444: e8 07 ff ff ff call 8048350 <printf@plt> | ||
+ | 8048449: c7 44 24 04 40 00 00 movl $0x40,0x4(%esp) | ||
+ | 8048450: 00 | ||
+ | 8048451: c7 04 24 01 00 00 00 movl $0x1,(%esp) | ||
+ | 8048458: e8 d3 fe ff ff call 8048330 <calloc@plt> | ||
+ | 804845d: 89 45 f0 mov %eax,-0x10(%ebp) | ||
+ | 8048460: 8b 45 f0 mov -0x10(%ebp),%eax | ||
+ | 8048463: 89 45 ec mov %eax,-0x14(%ebp) | ||
+ | 8048466: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%ebp) | ||
+ | 804846d: eb 34 jmp 80484a3 <main+0x7f> | ||
+ | 804846f: c7 44 24 04 40 00 00 movl $0x40,0x4(%esp) | ||
+ | 8048476: 00 | ||
+ | 8048477: c7 04 24 01 00 00 00 movl $0x1,(%esp) | ||
+ | 804847e: e8 ad fe ff ff call 8048330 <calloc@plt> | ||
+ | 8048483: 89 45 e8 mov %eax,-0x18(%ebp) | ||
+ | 8048486: 8b 45 e8 mov -0x18(%ebp),%eax | ||
+ | 8048489: c7 40 3c 00 00 00 00 movl $0x0,0x3c(%eax) | ||
+ | 8048490: 8b 55 ec mov -0x14(%ebp),%edx | ||
+ | 8048493: 8b 45 e8 mov -0x18(%ebp),%eax | ||
+ | 8048496: 89 42 3c mov %eax,0x3c(%edx) | ||
+ | 8048499: 8b 45 e8 mov -0x18(%ebp),%eax | ||
+ | 804849c: 89 45 ec mov %eax,-0x14(%ebp) | ||
+ | 804849f: 83 45 f4 01 addl $0x1,-0xc(%ebp) | ||
+ | 80484a3: 81 7d f4 7f 96 98 00 cmpl $0x98967f,-0xc(%ebp) | ||
+ | 80484aa: 7e c3 jle 804846f <main+0x4b> | ||
+ | 80484ac: 8b 45 ec mov -0x14(%ebp),%eax | ||
+ | 80484af: 8b 55 f0 mov -0x10(%ebp),%edx | ||
+ | 80484b2: 89 50 3c mov %edx,0x3c(%eax) | ||
+ | 80484b5: 8b 45 f0 mov -0x10(%ebp),%eax | ||
+ | 80484b8: 89 45 ec mov %eax,-0x14(%ebp) | ||
+ | 80484bb: e8 a0 fe ff ff call 8048360 <rand@plt> | ||
+ | 80484c0: 89 45 f4 mov %eax,-0xc(%ebp) | ||
+ | 80484c3: 81 7d f4 fe ff ff 3f cmpl $0x3ffffffe,-0xc(%ebp) // This is RAND_MAX/2 used for the red plot. | ||
+ | 80484ca: 7f ef jg 80484bb <main+0x97> // We use RAND_MAX-1 for the blue plot. | ||
+ | 80484cc: 8b 45 ec mov -0x14(%ebp),%eax | ||
+ | 80484cf: 8b 40 3c mov 0x3c(%eax),%eax | ||
+ | 80484d2: 89 45 ec mov %eax,-0x14(%ebp) | ||
+ | 80484d5: eb e4 jmp 80484bb <main+0x97> | ||
+ | 80484d7: 90 nop | ||
+ | 80484d8: 90 nop | ||
+ | 80484d9: 90 nop | ||
+ | 80484da: 90 nop | ||
+ | 80484db: 90 nop | ||
+ | 80484dc: 90 nop | ||
+ | 80484dd: 90 nop | ||
+ | 80484de: 90 nop | ||
+ | 80484df: 90 nop | ||
+ | </code> | ||
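Read back from the disassembly above, the inner loop draws a random number and advances the list pointer only when the number is below a threshold. A minimal C sketch of that structure follows; the names `struct node` and `guarded_walk`, the bounded iteration count, and the returned counter are assumptions added for illustration (the disassembled loop runs forever and keeps no count).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal node; the measured test used larger nodes. */
struct node {
    struct node *next;
};

/* Draw `iters` random numbers and advance *pos along the list each
 * time rand() < threshold. Returns how many times the pointer moved.
 * A threshold near RAND_MAX/2 makes the branch hard to predict; a
 * threshold near RAND_MAX makes it almost always go the same way. */
size_t guarded_walk(struct node **pos, int threshold, size_t iters)
{
    size_t advanced = 0;
    struct node *p = *pos;
    for (size_t i = 0; i < iters; i++) {
        int r = rand();
        if (r < threshold) {
            p = p->next;       /* dependent load, may miss in cache */
            advanced++;
        }
    }
    *pos = p;
    return advanced;
}
```

Calling `guarded_walk` once with `RAND_MAX / 2` and once with `RAND_MAX - 1` on the same large circular list corresponds to the two plotted configurations.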
+ | |||
+ | ===Array element-fetching Cache Miss Test=== | ||
+ | We create a huge array and load elements from it. Unlike in the linked-list test, a cache miss does not depend on the previous cache miss. | |
+ | Again, we add nops to reduce the cache miss rate. The lower the cache miss rate, the lower the peak temperature. \\ | |
+ | {{:analytical_modeling_of_system_temperature:arraymiss2.png?1000}} | ||
+ | <code> | ||
+ | 080483f4 <main>: | ||
+ | 80483f4: 8d 4c 24 04 lea 0x4(%esp),%ecx | ||
+ | 80483f8: 83 e4 f0 and $0xfffffff0,%esp | ||
+ | 80483fb: ff 71 fc pushl -0x4(%ecx) | ||
+ | 80483fe: 55 push %ebp | ||
+ | 80483ff: 89 e5 mov %esp,%ebp | ||
+ | 8048401: 51 push %ecx | ||
+ | 8048402: 83 ec 24 sub $0x24,%esp | ||
+ | 8048405: c7 44 24 04 04 00 00 movl $0x4,0x4(%esp) | ||
+ | 804840c: 00 | ||
+ | 804840d: c7 04 24 25 87 04 08 movl $0x8048725,(%esp) | ||
+ | 8048414: e8 13 ff ff ff call 804832c <printf@plt> | ||
+ | 8048419: c7 44 24 04 04 00 00 movl $0x4,0x4(%esp) | ||
+ | 8048420: 00 | ||
+ | 8048421: c7 04 24 00 00 90 01 movl $0x1900000,(%esp) | ||
+ | 8048428: e8 df fe ff ff call 804830c <calloc@plt> | ||
+ | 804842d: 89 45 f8 mov %eax,-0x8(%ebp) | ||
+ | 8048430: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%ebp) | ||
+ | 8048437: eb 13 jmp 804844c <main+0x58> | ||
+ | 8048439: 8b 45 f4 mov -0xc(%ebp),%eax | ||
+ | 804843c: c1 e0 02 shl $0x2,%eax | ||
+ | 804843f: 03 45 f8 add -0x8(%ebp),%eax | ||
+ | 8048442: c7 00 10 00 00 00 movl $0x10,(%eax) | ||
+ | 8048448: 83 45 f4 01 addl $0x1,-0xc(%ebp) | ||
+ | 804844c: 81 7d f4 ff ff 8f 01 cmpl $0x18fffff,-0xc(%ebp) | ||
+ | 8048453: 7e e4 jle 8048439 <main+0x45> | ||
+ | 8048455: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%ebp) | ||
+ | 804845c: e9 e8 01 00 00 jmp 8048649 <main+0x255> | ||
+ | 8048461: 8b 45 f4 mov -0xc(%ebp),%eax | ||
+ | 8048464: c1 e0 02 shl $0x2,%eax | ||
+ | 8048467: 03 45 f8 add -0x8(%ebp),%eax | ||
+ | 804846a: 8b 00 mov (%eax),%eax | ||
+ | 804846c: 83 45 f4 10 addl $0x10,-0xc(%ebp) | ||
+ | 8048470: 90 nop | ||
+ | 8048471: 90 nop | ||
+ | 8048472: 90 nop | ||
+ | |||
+ | \\ lots of nops to vary cache miss rate | ||
+ | |||
+ | 8048645: 90 nop | ||
+ | 8048646: 90 nop | ||
+ | 8048647: 90 nop | ||
+ | 8048648: 90 nop | ||
+ | 8048649: 81 7d f4 ff ff 8f 01 cmpl $0x18fffff,-0xc(%ebp) | ||
+ | 8048650: 0f 8e 0b fe ff ff jle 8048461 <main+0x6d> | ||
+ | 8048656: e9 fa fd ff ff jmp 8048455 <main+0x61> | ||
+ | 804865b: 90 nop | ||
+ | 804865c: 90 nop | ||
+ | 804865d: 90 nop | ||
+ | 804865e: 90 nop | ||
+ | 804865f: 90 nop | ||
+ | </code> | ||
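The disassembly above fills the array once and then repeatedly loads every 16th 4-byte element, that is, one load per 64 bytes. The following C sketch shows one pass of that sweep under stated assumptions: `alloc_filled`, `strided_sweep` and `STRIDE` are hypothetical names, the loaded values are summed only so the loads stay observable (the original discards them), and the inline `nop` block assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* 16 ints of 4 bytes each = 64 bytes, one typical cache line. */
enum { STRIDE = 16 };

/* Allocate n ints and set each one to `value`.
 * Returns NULL if the allocation fails. */
int *alloc_filled(size_t n, int value)
{
    assert(n > 0);
    int *a = calloc(n, sizeof *a);
    if (!a)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = value;
    return a;
}

/* One pass over the array, touching one element per STRIDE.
 * The addresses do not depend on earlier loads, so several
 * misses can be outstanding at once. The nop block stands in
 * for the padding that lowers the miss rate. */
long strided_sweep(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i += STRIDE) {
        sum += *(const volatile int *)&a[i];
        __asm__ volatile("nop\n\tnop\n\tnop\n\tnop");
    }
    return sum;
}
```

In the measured test the array is far larger than the cache and the sweep is repeated indefinitely; here a single bounded pass keeps the sketch easy to check.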
+ | |||
+ | |||
+ | ===Array element-fetching Branch Misprediction Test=== | ||
+ | We wrap a branch condition around the array cache miss code. In the 'if' part, we fetch the next element; in the 'else' part, we fetch the ((array_size/2)+1)st element. Since the array is large, a branch misprediction fetches useless data. Thus, a higher branch misprediction rate means we have to flush more often (and thus stall more). Results indicate that the higher the branch misprediction rate, the lower the peak temperature. Though this seems normal, the linked-list branch misprediction test suggested the opposite (not sure why).\\ | |
+ | {{:analytical_modeling_of_system_temperature:arraymisp.png?1000}} | ||
+ | <code> | ||
+ | 08048454 <main>: | ||
+ | 8048454: 8d 4c 24 04 lea 0x4(%esp),%ecx | ||
+ | 8048458: 83 e4 f0 and $0xfffffff0,%esp | ||
+ | 804845b: ff 71 fc pushl -0x4(%ecx) | ||
+ | 804845e: 55 push %ebp | ||
+ | 804845f: 89 e5 mov %esp,%ebp | ||
+ | 8048461: 51 push %ecx | ||
+ | 8048462: 83 ec 34 sub $0x34,%esp | ||
+ | 8048465: c7 45 ec fd 5c 1e 6e movl $0x6e1e5cfd,-0x14(%ebp) | ||
+ | 804846c: c7 44 24 04 04 00 00 movl $0x4,0x4(%esp) | ||
+ | 8048473: 00 | ||
+ | 8048474: c7 04 24 00 00 90 01 movl $0x1900000,(%esp) | ||
+ | 804847b: e8 dc fe ff ff call 804835c <calloc@plt> | ||
+ | 8048480: 89 45 e8 mov %eax,-0x18(%ebp) | ||
+ | 8048483: c7 04 24 00 00 00 00 movl $0x0,(%esp) | ||
+ | 804848a: e8 ed fe ff ff call 804837c <time@plt> | ||
+ | 804848f: 89 04 24 mov %eax,(%esp) | ||
+ | 8048492: e8 a5 fe ff ff call 804833c <srand@plt> | ||
+ | 8048497: c7 45 f0 ef 03 00 00 movl $0x3ef,-0x10(%ebp) | ||
+ | 804849e: c7 45 f8 00 00 00 00 movl $0x0,-0x8(%ebp) | ||
+ | 80484a5: eb 31 jmp 80484d8 <main+0x84> | ||
+ | 80484a7: e8 e0 fe ff ff call 804838c <rand@plt> | ||
+ | 80484ac: 89 45 f4 mov %eax,-0xc(%ebp) | ||
+ | 80484af: 8b 45 f4 mov -0xc(%ebp),%eax | ||
+ | 80484b2: 3b 45 ec cmp -0x14(%ebp),%eax | ||
+ | 80484b5: 7d 0d jge 80484c4 <main+0x70> | ||
+ | 80484b7: 8b 45 f8 mov -0x8(%ebp),%eax | ||
+ | 80484ba: c1 e0 02 shl $0x2,%eax | ||
+ | 80484bd: 03 45 e8 add -0x18(%ebp),%eax | ||
+ | 80484c0: 8b 00 mov (%eax),%eax | ||
+ | 80484c2: eb 10 jmp 80484d4 <main+0x80> | ||
+ | 80484c4: 8b 45 f8 mov -0x8(%ebp),%eax | ||
+ | 80484c7: 05 00 00 c8 00 add $0xc80000,%eax | ||
+ | 80484cc: c1 e0 02 shl $0x2,%eax | ||
+ | 80484cf: 03 45 e8 add -0x18(%ebp),%eax | ||
+ | 80484d2: 8b 00 mov (%eax),%eax | ||
+ | 80484d4: 83 45 f8 10 addl $0x10,-0x8(%ebp) | ||
+ | 80484d8: 81 7d f8 ff ff c7 00 cmpl $0xc7ffff,-0x8(%ebp) | ||
+ | 80484df: 7e c6 jle 80484a7 <main+0x53> | ||
+ | 80484e1: eb bb jmp 804849e <main+0x4a> | ||
+ | 80484e3: 90 nop | ||
+ | 80484e4: 90 nop | ||
+ | 80484e5: 90 nop | ||
+ | 80484e6: 90 nop | ||
+ | 80484e7: 90 nop | ||
+ | 80484e8: 90 nop | ||
+ | 80484e9: 90 nop | ||
+ | 80484ea: 90 nop | ||
+ | 80484eb: 90 nop | ||
+ | 80484ec: 90 nop | ||
+ | 80484ed: 90 nop | ||
+ | 80484ee: 90 nop | ||
+ | 80484ef: 90 nop | ||
</code> | </code> |
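As the disassembly above suggests, each iteration compares a random number against a threshold and loads either the element at the current index or the one half an array further on. A hedged C sketch of one such pass is given below; `branchy_sweep`, `STRIDE` and the `far_loads` counter are hypothetical additions, and the sum exists only to keep the loads observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* 16 ints of 4 bytes each = 64 bytes, one typical cache line. */
enum { STRIDE = 16 };

/* One pass over the first half of an n-element array. On each step,
 * load a[i] if rand() < threshold, otherwise load a[i + n/2].
 * *far_loads receives the number of second-half loads. Because the
 * two targets are far apart, data fetched down the wrong path is
 * unlikely to be reused soon. */
long branchy_sweep(const int *a, size_t n, int threshold,
                   size_t *far_loads)
{
    size_t half = n / 2;
    long sum = 0;
    *far_loads = 0;
    for (size_t i = 0; i < half; i += STRIDE) {
        if (rand() < threshold) {
            sum += a[i];            /* 'if' part: near element */
        } else {
            sum += a[i + half];     /* 'else' part: far element */
            (*far_loads)++;
        }
    }
    return sum;
}
```

Varying `threshold` between runs changes how predictable the branch is, which is the knob the test above turns.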