Problem Description

I am going to buy a new dedicated computer for running COMSOL Multiphysics®. What hardware do you recommend?

Solution

Due to the wide range of different problem types that COMSOL Multiphysics® solves, the rapid pace of software and hardware development, and the variety of different hardware at significantly different price points, there is no single optimal choice of computer for all usage cases.

Memory

The single most important factor is that you have enough physical memory (RAM) to solve the largest models that you want to work with, and that the RAM is correctly installed. If you do not have enough RAM, then there will be significant slowdown, regardless of all other hardware choices.

To predict RAM requirements, solve similar, but smaller, models that contain the same physics that you want to solve in your largest models. Monitor the memory used and the degrees of freedom, which are reported in the Solver Log. Fit a curve to this data of the form A × (dof)^N, where A and N are fitting coefficients and dof is the number of degrees of freedom, and use this to predict the memory requirements for your larger models. The exponent N will usually be between 1 and 2, and depends most strongly on the type of linear system solver being used. The factor A depends most strongly on the type of physics, and the combination of physics, being solved, but can also depend upon specific features within the model. Be aware that memory usage versus degrees of freedom can be very different between different model types, so you may need to repeat this procedure for every type of model that you wish to solve.
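As an illustration of the fitting procedure described above, the following is a minimal Python sketch. It fits A and N by linear least squares on the logarithms of the data, which is one common way to fit a power law, not necessarily the only suitable one. The function names and the sample numbers are hypothetical; replace them with the degrees of freedom and memory values reported in your own Solver Log.

```python
import math

def fit_power_law(dofs, mem_gb):
    """Fit mem = A * dof**N by least squares on log(dof) vs. log(mem)."""
    xs = [math.log(d) for d in dofs]
    ys = [math.log(m) for m in mem_gb]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    N = sxy / sxx                      # slope in log-log space is the exponent
    A = math.exp(y_mean - N * x_mean)  # intercept in log-log space gives A
    return A, N

def predict_memory_gb(A, N, dof):
    """Evaluate the fitted curve at a new number of degrees of freedom."""
    return A * dof ** N

# Hypothetical readings from smaller test models (not real measurements).
dofs = [50_000, 100_000, 200_000, 400_000]
mem_gb = [1.2, 2.9, 7.1, 17.0]

A, N = fit_power_law(dofs, mem_gb)
target_dof = 2_000_000
print(f"A = {A:.3e}, N = {N:.2f}")
print(f"Estimated memory at {target_dof} dof: {predict_memory_gb(A, N, target_dof):.1f} GB")
```

Treat the extrapolated value as an estimate only, and repeat the fit for each model type, since A and N can differ considerably between them.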

You will need a computer with at least your estimated maximum amount of RAM. Also be aware that there is no advantage to having significantly more RAM than is actually needed. Make sure to use the fastest possible memory speed supported by the CPU that you choose.

Performance is also strongly dependent on how the memory is installed. Computers access the installed memory via a multichannel memory bus. The memory speed will be clocked down if the memory is not correctly populated. For example, consider a four memory-channel single-CPU computer, with two slots per channel, for a total of 8 open DIMM slots, as shown in the schematic below. Assuming that you wish to install 64 GB of RAM, there are several ways in which this could be done.

Schematic of a Computer

For this computer, the optimal approach in terms of computer performance is to fill each of the eight slots with one 8 GB DIMM. However, this has the drawback that no expansion is possible: if you need to upgrade the RAM, you will need to buy all new memory. The nearly optimal approach is to put one 16 GB DIMM in each memory channel. The performance may be slightly lower than that of the optimal configuration, but the installed memory can be doubled by purchasing four more 16 GB DIMMs, so this configuration is the best overall. Purely in terms of expandability, the best option is instead to install a single 64 GB DIMM, as this leaves the most empty slots. However, performance in this case can be around four times slower, especially for large-memory models. This configuration is only recommended if expandability is the primary concern. Other configurations have no advantages. This is summarized in the schematic below. It is also good to check with your hardware vendor about optimal memory installation.

Recommended Memory Layouts
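The trade-off between populated channels and free slots can also be tabulated with a short Python sketch. This is a simplified illustration: the function name and the list of DIMM sizes are hypothetical, it assumes identical DIMMs that are spread across channels before any channel receives a second DIMM, and it does not model the slight performance difference between one and two DIMMs per channel.

```python
def memory_layouts(total_gb, channels=4, slots_per_channel=2,
                   dimm_sizes_gb=(8, 16, 32, 64)):
    """List ways to reach total_gb with identical DIMMs on one CPU."""
    total_slots = channels * slots_per_channel
    options = []
    for size in dimm_sizes_gb:
        if total_gb % size != 0:
            continue                   # this DIMM size cannot reach the target exactly
        count = total_gb // size
        if count > total_slots:
            continue                   # not enough slots for this many DIMMs
        options.append({
            "dimm_gb": size,
            "dimm_count": count,
            # Assumption: DIMMs are placed one per channel first.
            "channels_populated": min(count, channels),
            "free_slots": total_slots - count,
        })
    return options

for option in memory_layouts(64):
    print(option)
```

For the 64 GB example, only the layouts with eight 8 GB DIMMs or four 16 GB DIMMs populate all four channels, and only the latter also leaves slots free for expansion.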

Other Factors Affecting Overall Software Speed

There is a complicated relationship between performance, CPU type, CPU base frequency, cache, number of CPUs, number of cores per CPU, and hardware cost. The COMSOL codebase is composed of several different classes of algorithms, and these algorithms have different scaling properties. Therefore, some hardware factors will weigh more heavily on performance than others, and the relative merits of these factors are both problem-type and problem-size dependent. It is thus very difficult to make specific hardware recommendations. The following are general recommendations.

CPU Type

Different CPU architectures offer different sets of features, at significantly different prices.

High-end CPUs, such as the Intel® Xeon® Gold and Platinum or AMD® EPYC® processors, have CPU-to-CPU interconnects that enable multiple CPUs per computer and allow the CPUs to communicate with each other to access very large amounts of memory. These processors have the highest memory bandwidth, that is, the ability to quickly move a lot of data back and forth between RAM and the processor. That is their primary advantage when running COMSOL. High-end CPUs should be used in dual-CPU, or even four-CPU or eight-CPU, configurations. Such configurations are justified if you need to address very large amounts of memory or are planning to continuously run many simulations in parallel. When solving a single model, performance will improve with an increasing number of CPUs, but the relative improvement depends on model size: larger models will see greater speedup on multi-CPU systems. If you are considering purchasing a four- or eight-CPU system, please contact COMSOL Technical Support.

Mid-range CPUs, such as the Intel® Xeon® W or AMD® Ryzen™ Threadripper™ processors, do not have CPU-to-CPU interconnects and are thus an appropriate choice for a single-CPU computer. Their clock speeds and core counts are comparable to those of high-end systems. They are an attractive all-around choice.

Consumer-grade CPUs, such as Intel® Core™ processors, can offer very good performance, in some cases even better than that of the CPUs above, especially when solving smaller-memory models.

Clock Frequency

Higher clock frequency will generally lead to faster performance of the software in all areas. If all other hardware specifications are the same, the relative performance between two computers will be most directly dependent on clock frequency.

Cache Memory

Cache memory is built directly into the processor. A larger cache is better: all other factors being equal, a machine with more cache will show better performance.

Number of Cores

The more cores in the processor, the more parallel threads can be executed at once; this is known as multithreading. COMSOL will automatically take advantage of all available cores, but there is a computational cost to this. Using too many cores in parallel may even lead to a slowdown, although usually only for relatively small models. Some models are even dominated by their single-thread performance. In general, six- or eight-core systems are a good all-around choice, but more cores than that can be better, especially when running multiple models in parallel, or when using the PARDISO direct solver.


General Recommendations

Parametric Sweeps

If you plan to solve for many geometric variations, different meshes, different sets of materials, or other parameters within each unique model, then you will be using the Parametric Sweep functionality. For example, a sweep over 10 variations of a part dimension along with a sweep over 10 different materials and 10 different model parameters would require solving a similar model 1000 times, and the solution time when running this as a single job on a single computer will be (in the worst case) about 1000 times that of a single solve.
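The arithmetic behind this example can be sketched in a few lines of Python. The parameter values and the time per solve below are hypothetical placeholders, and the helper name is for illustration only; the point is simply that the number of solves is the product of the number of values of each swept parameter.

```python
from itertools import product

def sweep_cases(*parameter_lists):
    """Return every combination of the given parameter values."""
    return list(product(*parameter_lists))

# Hypothetical parameter values, for illustration only.
part_dimensions_mm = [10 + i for i in range(10)]
materials = [f"material_{i}" for i in range(10)]
model_parameters = [0.1 * i for i in range(10)]

cases = sweep_cases(part_dimensions_mm, materials, model_parameters)

minutes_per_solve = 3.0  # assumed time for a single solve
worst_case_hours = len(cases) * minutes_per_solve / 60.0
print(len(cases), "solves, about", worst_case_hours, "hours if run one after another")
```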

Solution time for sweeps over large numbers of parameters can be reduced by running jobs in parallel, either on a single computer, using any license type, or on a cluster computer, using the Floating Network License.

To solve in parallel on a single computer, use the Batch Sweep functionality. Running parametric sweeps in parallel on a single computer is only advised if all models will fit within memory at the same time. For example, if one instance of the model requires 3 GB of RAM to solve, then it can make sense to run four simultaneous jobs on a computer with 16 GB of RAM. For models with small memory requirements, you may see an improvement running as many simultaneous jobs as there are cores. The relative speedup when using Batch Sweep is both model- and hardware-dependent.
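A rough way to estimate how many simultaneous jobs fit on one computer is sketched below in Python. The function and its parameters are hypothetical, and the 2 GB reserved for the operating system and other processes is an assumption chosen for illustration, not a COMSOL recommendation. The estimate is capped by the number of cores, following the guidance above.

```python
def max_parallel_jobs(total_ram_gb, ram_per_job_gb, cores, reserved_gb=2.0):
    """Estimate how many jobs fit in memory at once, capped by core count."""
    usable_gb = total_ram_gb - reserved_gb   # assumed headroom for the OS
    by_memory = int(usable_gb // ram_per_job_gb)
    # Always allow at least one job, even if it will be memory-constrained.
    return max(1, min(by_memory, cores))

# The example from the text: 3 GB per job on a 16 GB computer.
print(max_parallel_jobs(total_ram_gb=16, ram_per_job_gb=3, cores=8))
```

Since the actual speedup is model- and hardware-dependent, it is worth timing a few job counts at or below this estimate before committing to a long sweep.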

To solve Parametric Sweeps in parallel on a cluster, use the Cluster Sweep functionality. There is no limit to the number of parallel jobs that you can run at once (up to the number of available nodes on the cluster). You can run on your own cluster or use a third-party cluster. COMSOL maintains a list of Technology Partners who provide on-demand computing resources for cluster computations. Each node of the cluster need only meet the requirements described for running a unique model. For further guidance on cluster hardware, see Knowledge Base 1116.

See also Knowledge Base 1250: Running parametric sweeps, batch sweeps, and cluster sweeps from the command line.

Always consider if you can avoid large sweeps by using the Optimization Module.

OS

In versions of COMSOL Multiphysics prior to version 5.4, Linux and macOS operating systems could outperform Windows on some processors with many cores.

Hard Drives

Solid State Drives give overall better system performance compared to Hard Drives. Faster drives are always better, but if the system is using the drive for swap space (virtual memory) on the models you are solving, it is better to upgrade the RAM rather than to invest in faster drives.

Graphics

We recommend modern AMD- or NVIDIA-based dedicated graphics cards. A list of tested graphics cards can be found on the system requirements page. The larger the memory in the graphics card, the more complex the models that can be visualized. Note that just because a model requires large amounts of RAM to solve does not necessarily mean that it will require a large video card to display, and vice versa.

GPUs

For more information about simulations using GPUs, see Setting Up GPU-Accelerated Computing Within COMSOL Multiphysics.

See Also

Selecting hardware for a compute cluster, solution 1116.
COMSOL and Multithreading, solution 1096.
COMSOL macOS Apple Silicon Native Support 1307.
Blogpost: How Large of a Model Can You Solve with COMSOL®?