Download and unzip the source code, or clone with:
git clone https://github.com/ProjectPhysX/FluidX3D.git
- There is no "installation" of FluidX3D. Instead, you have to compile the source code yourself.
- I have made this as easy as possible and this documentation will guide you through it. Nontheless, some basic programming experience with C++ would be good for the setup scripts.
- First, compile the code as-is; this is the standard FP32 benchmark test case. By default, the fastest installed GPU will be selected automatically. Compile time is about 5 seconds.
- Download and install Visual Studio Community. In Visual Studio Installer, add:
- Desktop development with C++
- MSVC v142
- Windows 10 SDK
- Open
FluidX3D.sln
in Visual Studio Community. - Compile and run by clicking the ► Local Windows Debugger button.
- To select a specific GPU, open Windows CMD in the
FluidX3D
folder (typecmd
in File Explorer in the directory field and press Enter), then runbin\FluidX3D.exe 0
to select device0
. You can also select multiple GPUs withbin\FluidX3D.exe 0 1 3 6
if the setup is configured as multi-GPU.
- Compile and run with:
chmod +x make.sh ./make.sh
- Compiling requires
g++
withC++17
, which is supported since version8
(check withg++ --version
). If you havemake
installed (check withmake --version
), compiling will will be faster using multiple CPU cores; otherwise compiling falls back to using a single CPU core. - To select a specific GPU, enter
./make.sh 0
to compile+run, orbin/FluidX3D 0
to run on device0
. You can also select multiple GPUs withbin/FluidX3D 0 1 3 6
if the setup is configured as multi-GPU. - Operating system (Linux/macOS/Android) and X11 support (required for
INTERACTIVE_GRAPHICS
) are detected automatically. In case problems arise, you can still manually selecttarget=...
inmake.sh
. - On macOS and Android,
INTERACTIVE_GRAPHICS
mode is not supported, as no X11 is available. You can still useINTERACTIVE_GRAPHICS_ASCII
though, or render video to the hard drive with regularGRAPHICS
mode.
- Now open
src/setup.cpp
. In here are all the sample setups, each one being avoid main_setup() {...}
function block written in C++. Uncomment one of them, maybe start top-to-bottom. - In the line where the
main_setup()
function starts, it says "required extensions in defines.hpp:", followed by a list of extensions in capital letters. Head over tosrc/defines.hpp
and comment outwith a//#define BENCHMARK
//
. Then, uncomment all of the extensions required for the setup by removing the//
in front of the corresponding line. - Finally, compile and run the setup with the ► Local Windows Debugger button (Windows) or
./make.sh
(Linux/macOS/Android). - Once the interactive graphics window opens, press key P to start/pause the simulation, and press H to show the help menu for keyboard controls and visualization settings.
- Go through some of the sample setups this way, get familiar with their code structure and test the graphics mode.
4. Keyboard/Mouse Controls for INTERACTIVE_GRAPHICS
/_ASCII
Key | Function |
---|---|
P | start/pause the simulation |
H | show/hide help menu for keyboard controls and visualization settings |
Esc Alt+F4 |
quit |
Mouse I J K L |
rotate camera |
Scrollwheel + - |
zoom (centered camera mode) or camera movement speed (free camera mode) |
Mouseclick U |
toggle rotation with Mouse and angle snap rotation with I J K L |
F | toggle centered/free camera mode |
W A S D Space C |
move free camera |
Y X | adjust camera field of view |
R | toggle camera autorotation |
G | print current camera position/rotation in console as copy/paste command |
V | toggle stereoscopic rendering for VR |
B | toggle VR-goggles/3D-TV mode for stereoscopic rendering |
N M | adjust eye distance for stereoscopic rendering |
1 | flag wireframe / solid surface (and force vectors on solid cells or surface pressure if the extension is used) |
2 | velocity field |
3 | streamlines |
4 | vorticity (velocity-colored Q-criterion isosurface) |
5 | rasterized free surface |
6 | raytraced free surface |
7 | particles |
T | toggle slice visualization mode |
Z | toggle field visualization mode |
Q E | move slice in slice visualization mode |
- For initializing the simulation box, use call
constructor.
LBM lbm(Nx, Ny, Nz, nu, ...);
Nx
/Ny
/Nz
is the grid resolution andnu
is the kinematic shear viscosity in LBM units. - To use multiple GPUs, use
with
LBM lbm(Nx, Ny, Nz, Dx, Dy, Dz, nu, ...);
Dx
/Dy
/Dz
indicating how many domains (GPUs) there are in each spatial direction. The productDx
×Dy
×Dz
is the total number of domains (GPUs). - As long as the
lbm
object is in scope, you can access the memory. As soon as it goes out of scope, all memory associated with the current simulation is freed again. - The grid resolution
Nx
/Ny
/Nz
ultimately determines the VRAM occupation. Quite often it's not obvious at which resolution you'll overshoot the VRAM capacity of the GPU(s). To aid with this, there is the function:This takes as inputs the desired aspect ratio of the simulation box and the VRAM occupation in MB, and returns the grid resolution as aconst uint3 lbm_N = resolution(float3(1.0f, 2.0f, 0.5f), 2000u);
uint3
with.x
/.y
/.z
components. You can also directly feed theuint3
into the LBM constructor as resolution:LBM lbm(lbm_N, nu, ...);
- The LBM simulation uses a different unit system from SI units, where density
rho=1
and velocityu≈0.001-0.1
, because floating-point arithmetic is most accurate close to1
. - To ease unit conversion from SI to LBM units and back, there is the
units.hpp
struct. By callingthe base unit conversion factors [m], [kg], [s] are calculated and stored in theunits.set_m_kg_s(lbm_length, lbm_velocity, lbm_density=1, si_length, si_velocity, si_density);
units
struct. Thereafter, any of the conversion functions fromsrc/units.hpp
can be used to go from SI to LBM units and back, such aslbm_nu = units.nu(si_nu)
to convert the kinematic viscosity from SI to LBM units. - A good beginner example for this is the "aerodynamics of a cow" setup.
- If not explicitly set, by default all cells have the default values
rho=1
,u=0
,flags=0
. - The initial/boundary conditions of single grid cells are set in a parallelized loop that iterates over the entire grid:
Within this loop, you can set the density, velocity and flags of each cell individually by assigning values to
const uint Nx=lbm.get_Nx(), Ny=lbm.get_Ny(), Nz=lbm.get_Nz(); parallel_for(lbm.get_N(), [&](ulong n) { uint x=0u, y=0u, z=0u; lbm.coordinates(n, x, y, z); // ... });
lbm.rho[n]
,lbm.u.x[n]
,lbm.u.y[n]
,lbm.u.z[n]
andlbm.flags[n]
. Then
here is the linearized 3D grid index, corresponding to an (x
|y
|z
) position via the functionlbm.coordinates(n, x, y, z)
. - For example, to set solid walls at the left and right sides of the simulation box, write
within the loop.
if(y==0u||y==Ny-1u) lbm.flags[n] = TYPE_S;
- All box sides where no solid (
TYPE_S
) or other boundary type are set will remain periodic boundaries. - Primitive geometry, such as spheres, ellipsoids, cubes, cuboids, cylinders, codes, pipes, triangles, inclined planes, or toruses can be set with the functions from
shapes.hpp
. Example to insert a cylinder:if(cylinder(x, y, z, lbm.center(), float3(axis_x, axis_z, axis_z), radius) lbm.flags[n] = TYPE_S;
- The non-moving no-slip mid-grid bounce-back boundaries (
TYPE_S
) are always available without further extensions. "No-slip bounce-back" refers to the property that the flow velocity directly at the boundary is 0 (no-slip condition). "Mid-grid" refers to the boundary being located exactly in the middle between the boundary cells and adjacent fluid cells. - For inflow/outflow boundaries, you need to enable (uncomment) the
EQUILIBRIUM_BOUNDARIES
extension. Then, for the specific inflow/outflow cells, set the flaglbm.flags[n] = TYPE_E
and on the same cell specify either a densitylbm.rho[n]
unequal to1
or a velocitylbm.u.x[n]
/lbm.u.y[n]
/lbm.u.z[n]
unequal to0
, or a combination of both.TYPE_E
cells enforce the specified density/velocity value and absorb any incoming shockwaves. - For moving solid boundaries, you need to enable (uncomment) the
MOVING_BOUNDARIES
extension. Then, for the specific solid cells, set the flaglbm.flags[n] = TYPE_S
and on the same cell specify a velocitylbm.u.x[n]
/lbm.u.y[n]
/lbm.u.z[n]
unequal to0
.TYPE_S
cells reflect any incoming shockwaves. - If strict mass conservation is required (for example flow through a linear pipe), use periodic boundaries (i.e. don't set any boundary type on the cells at these simulation box sides), and drive the flow with a volume force (equivalent to a pressure gradient). Therefore you need to enable (uncomment) the
VOLUME_FORCE
extension, and in the LBM constuctor set the force per volume (fx
|fy
|fz
):These force per volume values should not exceedLBM lbm(Nx, Ny, Nz, nu, fx, fy, fz);
0.001
in magnitude.
- Call
lbm.run()
(without input parameter, it's infinite time steps) to initialize and execute the setup, orlbm.run(time_steps)
to execute only a specific number of time steps. - If you have a more complicated simulation loop where you periodically compute time steps and render images for a video or export data, don't forget to place an
lbm.run(0u)
before that loop. This copies the initial/boundary conditions from CPU RAM to GPU VRAM and initializes the simulation on the GPU, without computing a time step. Without initialization, there is no data in VRAM yet for rendering.
- For more complex geometries, you can load
.stl
triangle meshes and voxelize them to the Cartesian simulation grid on the GPU(s). - Create a
FluidX3D/stl/
folder next to theFluidX3D/src/
folder and download the geometry from websites like Thingiverse, or create your own. - Only binary
.stl
files are supported. For conversion from other formats or for splitting composite geometries like helicopter hull and rotors, I recommend Microsoft 3D Builder on Windows or Blender on Windows/Linux. - Load and voxelize simple
.stl
files directly withThis automatically repositions/rescales the mesh to the specified center. Uselbm.voxelize_stl(get_exe_path()+"../stl/mesh.stl", center, rotation, size);
lbm.center()
for the simulation box center, or add an offset with a+float3(offset_x, offset_y, offset_z)
. You can generate and multiply together a rotation matrix like this (example: rotation around the z-axis by 180°, then around the x-axis by 90°):float3x3 rotation = float3x3(float3(1, 0, 0), radians(90.0f))*float3x3(float3(0, 0, 1), radians(180.0f));
- To load composite geometries with several parts without automatic mesh repositioning/rescaling, use
to load the meshes from the
Mesh* mesh_1 = read_stl(const string& path, const float scale=1.0f, const float3x3& rotation=float3x3(1.0f), const float3& offset=float3(0.0f)); // load mesh without automatic repositioning/rescaling Mesh* mesh_2 = read_stl(const string& path, const float scale=1.0f, const float3x3& rotation=float3x3(1.0f), const float3& offset=float3(0.0f)); mesh_1->scale(const float scale); // manually scale meshes mesh_2->scale(const float scale); mesh_1->translate(const float3& translation); // manually reposition meshes mesh_2->translate(const float3& translation); lbm.voxelize_mesh_on_device(mesh_1); // voxelize meshes on GPU lbm.voxelize_mesh_on_device(mesh_2);
.stl
files, manually scale/reposition all parts of the mesh the same time, and finally voxelize them on the GPU. - To aid with repositioning the mesh, there is
lbm.center()
for the center of the simulation box, as well as the min/max bounding-box coordinates of the meshmesh->pmin
/mesh->pmax
, each afloat3
with (x
|y
|z
) components. - Rotating geometries have to be periodically revoxelized, about every 1-10 LBM time steps. In the main simulation loop in the
main_setup()
function, first rotate the triangle mesh, then revoxelize on GPU, then compute a few LBM time steps:Hereconst uint lbm_T = 100000u; // number of LBM time steps to simulate const uint lbm_dt = 4u; // number of LBM time steps between each mesh revoxelization lbm.run(0u); // initialize simulation while(lbm.get_t()<lbm_T) { // main simulation loop mesh->rotate(float3x3(float3(0, 0, 1), lbm_omega*(float)lbm_dt)); // rotate the triangle mesh lbm.voxelize_mesh_on_device(mesh, TYPE_S, center, float3(0.0f), float3(0.0f, 0.0f, lbm_omega)); // revoxelize the rotated triangle mesh, provide the instantaneous angular velocity vector for moving boundaries lbm.run(lbm_dt); // run lbm_dt LBM time steps }
lbm_omega
is the angular velocity in radians per time step,lbm_dt
is the number of simulated time steps between revoxelizations, andfloat3(0.0f, 0.0f, lbm_omega)
is the instantaneous angular velocity as a vector along the axis of rotation. The largest displacement of the outermost cells should not exceed1
cell between revoxelizations; setlbm_omega = lbm_u/lbm_radius
accordingly. - Have a look at the "Cessna 172" and "Bell 222" setups for some examples.
- For video rendering, disable (comment out)
INTERACTIVE_GRAPHICS
andINTERACTIVE_GRAPHICS_ASCII
and enable (uncomment)GRAPHICS
insrc/defines.hpp
. - Set the video resolution as
GRAPHICS_FRAME_WIDTH
/GRAPHICS_FRAME_HEIGHT
and the background color asGRAPHICS_BACKGROUND_COLOR
. You can also adjust the otherGRAPHICS_...
options there, such as semi-transparent rendering mode, or adjust the color scale for velocity withGRAPHICS_U_MAX
. - A basic loop for rendering video in the
main_setup()
function looks like this:lbm.graphics.visualization_modes = VIS_FLAG_LATTICE|VIS_Q_CRITERION; // set visualization modes, see all available visualization mode macros (VIZ_...) in defines.hpp const uint lbm_T = 10000u; // number of LBM time steps to simulate lbm.run(0u); // initialize simulation while(lbm.get_t()<lbm_T) { // main simulation loop if(lbm.graphics.next_frame(lbm_T, 25.0f)) { // render enough frames for 25 seconds of 60fps video lbm.graphics.set_camera_free(float3(2.5f*(float)Nx, 0.0f*(float)Ny, 0.0f*(float)Nz), 0.0f, 0.0f, 50.0f); // set camera to position 1 lbm.graphics.write_frame(get_exe_path()+"export/camera_1/"); // export image from camera position 1 lbm.graphics.set_camera_centered(-40.0f, 20.0f, 78.0f, 1.25f); // set camera to position 2 lbm.graphics.write_frame(get_exe_path()+"export/camera_2/"); // export image from camera position 2 } lbm.run(1u); // run 1 LBM time step }
- To find suitable camera placement, run the simulation at low resolution in
INTERACTIVE_GRAPHICS
mode, rotate/move the camera to the desired position, click the Mouse to disable mouse rotation, and press G to print the current camera settings as a copy-paste command in the console. Alt+Tab to the console and copy the camera placement command by selecting it with the mouse and right-clicking, then paste it into themain_setup()
function. - The visualization mode(s) can be specified as
lbm.graphics.visualization_modes
with theVIS_...
macros. You can also set thelbm.graphics.slice_mode
(0
=no slice,1
=x,2
=y,3
=z,4
=xz,5
=xyz,6
=yz,7
=xy) and reposition the slices withlbm.graphics.slice_x
/lbm.graphics.slice_y
/lbm.graphics.slice_z
. - Exported frames will automatically be assigned the current simulation time step in their name, in the format
bin/export/image-123456789.png
. - To convert the rendered
.png
images to video, use FFmpeg:ffmpeg -framerate 60 -pattern_type glob -i "export/*/image-*.png" -c:v libx264 -pix_fmt yuv420p -b:v 24M "video.mp4"
- At any point in time, you can export volumetric data as binary
.vtk
files with:lbm.rho.write_device_to_vtk(); lbm.u.write_device_to_vtk(); lbm.flags.write_device_to_vtk(); lbm.phi.write_device_to_vtk(); // only for SURFACE extension lbm.T.write_device_to_vtk(); // only for TEMPERATURE extension lbm.write_mesh_to_vtk(const Mesh* mesh); // for exporting triangle meshes
- These functions first pull the data from the GPU(s) into CPU RAM, and then write it to the hard drive.
- If unit conversion with
units.set_m_kg_s(...)
was specified, the data in exported.vtk
files is automaticlally converted to SI units. - Exported files will automatically be assigned the current simulation time step in their name, in the format
bin/export/u-123456789.vtk
. - Be aware that these volumetric files can be gigantic in file size, tens of GigaByte for a single file.
- You can view/evaluate the
.vtk
files for example in ParaView. - It is recommended to use the C++ functionality in the
main_setup()
function directly to extract the data of interest and selectively only write that to the hard drive. Therefore, calllbm.u.read_from_device()
to copy the data from the GPU(s) to CPU RAM, and then you can access it directly, for exampleto get the x-velocity at the position (const float lbm_velocity_x = lbm.u.x[lbm.index(x, y, z)];
x
|y
|z
) in LBM units. - To convert the velocity from LBM to SI units, use
after having done unit conversion with
const float si_velocity_x = units.si_u(lbm_velocity_x);
units.set_m_kg_s(...)
. - You can also export the
.stl
triangle meshes to binary.vtk
files with:lbm.write_mesh_to_vtk(const Mesh* mesh);
- Enable (uncomment) the
FORCE_FIELD
extension. This extension allows computing boundary forces on every solid cell (TYPE_S
) individually, as well as placing an individual volume force on every fluid cell (not used here). - In the
main_setup()
function's main simulation loop, alternatingly call:The latter computes the boundary forces on the GPU into thelbm.run(lbm_dt); // run lbm_dt LBM time steps lbm.calculate_force_on_boundaries(); // compute boundary forces on GPU on all solid cells (TYPE_S)
lbm.F
field in VRAM. - To copy
lbm.F
from GPU VRAM to CPU RAM, call:You can then access the boundary forces at each individual cell with:lbm.F.read_from_device();
float lbm_force_x_n = lbm.F.x[lbm.index(x, y, z)];
- To sum over all the individual boundary cells that belong to the body, to get the total force on the body, first voxelize the body with
with the additional
lbm.voxelize_mesh_on_device(mesh, TYPE_S|TYPE_X);
TYPE_X
flagging, and then callto sum over all cells markedconst float3 lbm_force = lbm.calculate_force_on_object(TYPE_S|TYPE_X);
TYPE_S|TYPE_X
that belong to the body. You can also useTYPE_Y
for this. - Finally, convert from LBM to SI units with
after having done unit conversion with
const float si_force_x = units.si_F(lbm_force.x);
units.set_m_kg_s(...)
. - See the "Ahmed body" setup for an example. Note that in the highly turbulent regime, computed body forces are too large by up to a factor 2, because even large resolution is not enough to fully capture the turbulent boundary layer. A wall function is needed, I'll scan literature on it.
By now you're already familiar with the additional boundary types through extensions VOLUME_FORCE
, FORCE_FIELD
, EQUILIBRIUM_BOUNDARIES
, and MOVING_BOUNDARIES
. The remaining available model extensions are briefly outlined here:
SURFACE
Extension
- To simulate free water surfaces, enable (uncomment) the
SURFACE
extension. - All cells then get 3 additional flags:
TYPE_F
(fluid),TYPE_I
(interface), andTYPE_G
(gas). Fluid cells are computed with regular LBM. Interface cells account for the extra surface tension forces, if the surface tension coefficientsigma
is set greater than0
in the LBM constructor; the interface is always 1 cell layer thick. Gas cells are not simulated at all and are essentially treated as vacuum. - If not set otherwise in the initial conditions, all cells are initialized as
TYPE_G
by default. As initial conditions, set all cells that should be fluid toThe interface layer will be automatically initialized during initialization withlbm.flags[n] = TYPE_F;
lbm.run(0u)
. - Addidionally to the 3 flags, each cell also gets assigned a fill level
lbm.phi[n]
:1
for fluid cells (TYPE_F
),]0,1[
for interface cells (TYPE_I
), and0
for gas cells (TYPE_G
). You can set this fill level at initialization, additionally to the cell flag. Do not forget to set the cell flag. Iflbm.phi[n]
is not set manually, it will automatically be initialized such that all fluid cells getphi=1
, all interface cells getphi=0.5
, and all gas clls getphi=0
assigned. - For a simple example, see the "dam break" setup. A more advanced sample setup for free surfaces is the "raindrop impact".
TEMPERATURE
Extension
- With the
TEMPERATURE
extension, FluidX3D can model thermal convection flows. This extension automatically also enables theVOLUME_FORCE
extension. - In the LBM constructor, you then need to set the volume force (
fx
|fy
|fz
), the thermal diffusion coefficientalpha
, and the thermal expansion coefficientbeta
, all in LBM units:LBM lbm(Nx, Ny, Nz, nu, fx, fy, fz, 0.0f, alpha, beta); // the "0.0f" is for the surface tension coefficient sigma which is not used here and has to remain 0
- With the extension, each grid cell gets an additional temperature
lbm.T[n]
(in LBM units) assigned. The default temperature in LBM units is1
. - To set temperature boundary conditions, use the flag
TYPE_T
and for the same cells assign a temperature unequal to1
:lbm.flags[n] = TYPE_T; // make the cell n a temperature boundary lbm.T[n] = 1.2f; // set this temperature boundary hotter than average
- See the "Rayleigh-Benard convection" and "thermal convection" setups for two examples.
SUBGRID
Extension
- Fluid flow is characterized by the Reynolds number
Re = x·u∕nu
with a characteristic length scalex
, a characteristic velocityu
and the kinematic shear viscositynu
. Larger length scale, larger velocity or smaller viscosity all mean larger Reynolds number. - The Reynolds number is a unit-less number. A low value Re < 2300 means laminar flow, a high value Re > 2900 means turbulent flow. In between is a transitional regime.
- For very large Reynolds number Re > 100000, the LBM solver becomes unstable, as tiny, very fast rotating vortices can be present in the flow field, and too fast velocity and shear rate makes the simulation blow up.
- To tackle this problem, there is subgrid models that model vortices smaller than single grid cells. This works by increasing the effective vscosity where the shear rate is large and lots of small eddies are assumed to be present. Coincidentally, locations of high shear rate and low viscosity cause instability, so increasing effective viscosity there keeps the simulation stable.
- The subgrid model in FLuidX3D is the Smagorinsky-Lilly model. You can enable it with the
SUBGRID
extension. - There is no additional performance cost for this extension.
PARTICLES
Extension
- By default, the LBM is a grid-based simulation, so there are no particles.
- But the
PARTICLES
extension allows to add particles to the simulation, either as passive tracers or as 2-way-coupled particles that can do floating/sedimentation. - For passive tracers, only enable the
PARTICLES
extension, and in the LBM constructor simply add the particle count:LBM lbm(Nx, Ny, Nz, nu, 50000u); // this will create 50000 particles
- Then, in initialization, make a loop over all particles (outside of the initialization loop that iterates over all grid cells):
for(ulong n=0ull; n<lbm.particles->length(); n++) { lbm.particles->x[n] = random_symmetric(0.5f*lbm.size().x); // this will palce the particles randomly anywhere in the simulation box lbm.particles->y[n] = random_symmetric(0.5f*lbm.size().y); lbm.particles->z[n] = random_symmetric(0.5f*lbm.size().z); }
- Note that the position (
0
|0
|0
) for particles corresponds to the simulation box center. - For 2-way-coupled particles, additionally enable the
VOLUME_FORCE
andFORCE_FIELD
extensions, and in the LBM constructor add the particle density (in LBM units) unequal to1
:LBM lbm(Nx, Ny, Nz, nu, 50000u, 1.2f); // this will create 50000 particles that are more dense than the fluid and will sink to the bottom
- Sometimes in the velocity field or streamlines visualization, you will see fuzzyness, or something that looks like a rapidly growing white crystal, blowing up from a certain point and filling the entire simulation box. This is instability, i.e. when velocities turn
NaN
orInf
. - Often times, the cause of instability is an unfortunate choice of unsuitable parameters:
- too high/low density
rho
(ideally should be very close to1
at all times) - too high velocity
u
(must never exceed0.57
anywhere in the box, ideally should be somewhere around0.1
, but can be as small as0.001
) - too low kinematic shear viscosity
nu
(ideally close to1/6
, becomes unstable when it's very very close to0
(then enable theSUBGRID
extension), and should not exceed3
) - too high force per volume (
fx
|fy
|fz
) (should not exceed0.001
in magnitude) - too high surface tension coefficient
sigma
(should not exceed0.1
)
- too high/low density
- The best parametrization for LBM simulations is an art in itself and needs some practice.