The time integration scheme is second-order Runge-Kutta for the velocity and pressure equations and forward-in-time for the scalar fields (such as temperature and moisture). The model uses a split integration approach to improve computational efficiency when both slow and fast modes are present in the system: sound waves are integrated with a small time step, while slower processes such as advection and mixing are integrated with a larger, more economical time step.
The spatial advection scheme for momentum and pressure is third-order upwind. Scalar variables are transported using a second-order, fully three-dimensional Eulerian monotonic advection scheme.
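The split approach can be illustrated with a toy scalar problem. The following is only a minimal sketch, not the model's actual solver: the function name `split_step`, the two tendency callbacks, and the toy coefficients are all hypothetical. It assumes the slow tendency is evaluated once per large step and held fixed while the fast term is sub-cycled with a midpoint (second-order Runge-Kutta) update.

```python
def split_step(y, dt_large, n_small, slow_tendency, fast_tendency):
    """Advance y by one large step, sub-cycling the fast term (illustrative only)."""
    slow = slow_tendency(y)              # evaluated once, then frozen for the large step
    dt_small = dt_large / n_small
    for _ in range(n_small):
        # Midpoint (second-order Runge-Kutta) update on the small step
        k1 = fast_tendency(y) + slow
        y_mid = y + 0.5 * dt_small * k1
        k2 = fast_tendency(y_mid) + slow
        y = y + dt_small * k2
    return y


def integrate(y0, dt_large, n_small, n_large, slow_tendency, fast_tendency):
    """Run n_large large steps, each split into n_small small steps."""
    y = y0
    for _ in range(n_large):
        y = split_step(y, dt_large, n_small, slow_tendency, fast_tendency)
    return y


# Toy problem (assumed for illustration): fast damping plus constant slow forcing.
fast = lambda y: -0.5 * y
slow = lambda y: 1.0
y_split = integrate(0.0, 6.0, 4, 50, slow, fast)    # four small steps per large step
y_single = integrate(0.0, 6.0, 1, 50, slow, fast)   # no sub-cycling
```

For this toy problem the sub-cycled run settles at the steady state `y = 2`, whereas taking the full large step in one go is unstable, which is the motivation for integrating the fast modes on a smaller step.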
There can be up to 7 scalar variables, depending on the representation of the microphysical species. Both liquid and ice microphysical variables can be included in the model solution.
3-D arrays are defined in either (nx,ny,nz)/ijk or (nz,ny,nx)/kji style. All runs were made in dedicated mode. Timings were obtained from the Fortran call "etime", which measures the sum of elapsed user and system time since the last etime call.
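The practical difference between the two array styles is which index is contiguous in memory. Since Fortran stores arrays in column-major order, the first index varies fastest. The sketch below computes 0-based linear offsets for both layouts; the function names are hypothetical and the grid dimensions are arbitrary values chosen for illustration.

```python
def offset_ijk(i, j, k, nx, ny, nz):
    """0-based offset of grid point (i, j, k) in a column-major (nx, ny, nz) array."""
    return i + nx * (j + ny * k)


def offset_kji(i, j, k, nx, ny, nz):
    """0-based offset of the same grid point in a column-major (nz, ny, nx) array."""
    return k + nz * (j + ny * i)


nx, ny, nz = 4, 3, 2  # arbitrary small grid, for illustration only

# Memory stride (in elements) when stepping one point in x or in z
stride_x_ijk = offset_ijk(1, 0, 0, nx, ny, nz) - offset_ijk(0, 0, 0, nx, ny, nz)
stride_z_ijk = offset_ijk(0, 0, 1, nx, ny, nz) - offset_ijk(0, 0, 0, nx, ny, nz)
stride_x_kji = offset_kji(1, 0, 0, nx, ny, nz) - offset_kji(0, 0, 0, nx, ny, nz)
stride_z_kji = offset_kji(0, 0, 1, nx, ny, nz) - offset_kji(0, 0, 0, nx, ny, nz)
```

In the ijk style neighbouring points in x are adjacent in memory, while in the kji style neighbouring points in the vertical are adjacent, so the loop that should be innermost differs between the two.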
Compiler flags were used to automatically parallelize the code.
Model parameters: time step: 6.0 s; number of small time steps: 4; number of large time steps: 50
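As a quick check on what these parameters imply, the arithmetic can be written out as follows. This assumes that the 6.0 s value is the large time step and that the four small steps are taken within each large step; neither is stated explicitly above, and the variable names are illustrative.

```python
dt_large = 6.0   # large time step in seconds (assumed to be the quoted "time step")
n_small = 4      # small steps per large step (assumed interpretation)
n_large = 50     # number of large steps in the benchmark run

dt_small = dt_large / n_small          # small (acoustic) time step in seconds
simulated_time = dt_large * n_large    # total simulated time in seconds
total_small_steps = n_small * n_large  # small steps taken over the whole run
```

Under these assumptions each benchmark run covers 300 s of simulated time using a 1.5 s small step.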

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 962.08600 | 1.0000 | 943.44537 | 1.0000 |
| 2 | 519.78723 | 1.8509 | 506.98990 | 1.8609 |
| 4 | 298.65497 | 3.2214 | 289.60947 | 3.2576 |
| 8 | 192.94649 | 4.9863 | 185.50102 | 5.0859 |
| 16 | 128.81763 | 7.4686 | 123.13338 | 7.6620 |
| 32 | 177.89604 | 5.4081 | 169.95786 | 5.5511 |

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 1009.45972 | 1.0000 | 1018.06079 | 1.0000 |
| 2 | 538.53094 | 1.8745 | 542.05200 | 1.8782 |
| 4 | 269.71603 | 3.7427 | 269.42114 | 3.7787 |
| 8 | 143.06299 | 7.0561 | 141.38454 | 7.2006 |
| 16 | 81.22086 | 12.4286 | 78.86033 | 12.9097 |
| 32 | 59.14380 | 17.0679 | 56.31403 | 18.0783 |

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 2624.50635 | 1.0000 | 2552.86133 | 1.0000 |
| 2 | 1425.517941 | 1.8411 | 1374.04224 | 1.8579 |
| 4 | 834.33459 | 3.1456 | 793.83734 | 3.2158 |
| 8 | 482.77496 | 5.4363 | 453.13104 | 5.6338 |
| 16 | 338.14569 | 7.7615 | 313.10388 | 8.1534 |
| 32 | 277.27332 | 9.4654 | 253.26633 | 10.0798 |

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 2635.51025 | 1.0000 | 2638.30884 | 1.0000 |
| 2 | 1425.13977 | 1.8493 | 1420.25928 | 1.8576 |
| 4 | 733.49231 | 3.5931 | 724.62524 | 3.6409 |
| 8 | 408.04532 | 6.4589 | 396.43494 | 6.6551 |
| 16 | 220.17799 | 11.9699 | 205.93039 | 12.8116 |
| 32 | 141.09566 | 18.6789 | 123.79723 | 21.3115 |

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 7324.24805 | 1.0000 | 7097.02930 | 1.0000 |
| 2 | 4034.58472 | 1.8154 | 3875.48022 | 1.8313 |
| 4 | 2168.64941 | 3.3773 | 2044.30798 | 3.4716 |
| 8 | 1245.31323 | 5.8815 | 1150.43250 | 6.1690 |
| 16 | 841.72296 | 8.7015 | 746.78992 | 9.5034 |
| 32 | 645.82166 | 11.3410 | 566.58411 | 12.5260 |

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 7422.19287 | 1.0000 | 7406.74805 | 1.0000 |
| 2 | 3936.79468 | 1.8853 | 3899.90625 | 1.8992 |
| 4 | 2076.00610 | 3.5752 | 2028.36841 | 3.6516 |
| 8 | 1235.76123 | 6.0062 | 1180.80627 | 6.2726 |
| 16 | 629.10284 | 11.7981 | 565.89575 | 13.0885 |
| 32 | 423.07086 | 17.5436 | 353.76389 | 20.9370 |

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 8190.19873 | 1.0000 | 7950.73193 | 1.0000 |
| 2 | 4344.58740 | 1.8851 | 4174.16455 | 1.9047 |
| 4 | 2512.65356 | 3.2596 | 2379.09399 | 3.3419 |
| 8 | 1458.82080 | 5.6143 | 1328.69141 | 5.9839 |
| 16 | 879.32568 | 9.3142 | 782.48157 | 10.1609 |
| 32 | 685.76770 | 11.9431 | 599.96045 | 13.2521 |

| threads | total cpu time (sec) | total speedup | solver cpu time (sec) | solver speedup |
|---|---|---|---|---|
| 1 | 8092.06250 | 1.0000 | 8081.73486 | 1.0000 |
| 2 | 4431.94580 | 1.8258 | 4394.77490 | 1.8389 |
| 4 | 2270.64404 | 3.5638 | 2220.60474 | 3.6394 |
| 8 | 1321.80176 | 6.1220 | 1265.08594 | 6.3883 |
| 16 | 673.09802 | 12.0221 | 613.44598 | 13.1743 |
| 31 | 412.3430 | 19.6246 | 351.9050 | 22.9656 |
| 32 | 406.72824 | 19.8955 | 347.37543 | 23.2651 |
| 48 | 319.05573 | 25.3625 | 256.69330 | 31.4840 |
| 62 | 283.65887 | 28.5274 | 217.04993 | 37.2344 |
| 63 | 280.34961 | 28.8642 | 212.94124 | 37.9529 |
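
The speedup columns follow directly from the timings: speedup is the one-thread time divided by the n-thread time. Dividing that by the thread count gives a parallel efficiency, which the tables do not list. The sketch below recomputes both for a few rows of the last table; the function names are illustrative, and the timings are copied from the table above.

```python
def speedup(t_one, t_n):
    """Speedup relative to the single-thread run."""
    return t_one / t_n


def efficiency(t_one, t_n, n_threads):
    """Parallel efficiency: speedup divided by the number of threads."""
    return speedup(t_one, t_n) / n_threads


# Total cpu times (sec) from the last table, keyed by thread count
total_time = {1: 8092.06250, 16: 673.09802, 32: 406.72824, 63: 280.34961}

results = {
    n: (speedup(total_time[1], t), efficiency(total_time[1], t, n))
    for n, t in total_time.items()
}

for n, (s, e) in sorted(results.items()):
    print(f"{n:3d} threads: speedup {s:8.4f}, efficiency {e:6.3f}")
```

Efficiency makes the diminishing returns at high thread counts easier to see than the raw speedup does.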