Device Selection¶
Moreau automatically selects the best available device, or you can specify the CPU or GPU manually.
Automatic Selection¶
By default, Moreau uses the best available device:
import moreau
# Uses CUDA if available, otherwise CPU
solver = moreau.Solver(P, q, A, b, cones=cones)
print(f"Using device: {solver.device}")
The automatic selection considers:
- CUDA availability
- Problem size (small problems may run faster on the CPU)
- Batch size (larger batches benefit more from the GPU)
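These considerations can be pictured with a short sketch. The function below is purely illustrative, not Moreau's actual selection logic; the thresholds are invented for demonstration and roughly mirror the guidance table at the end of this page.

```python
def pick_device(cuda_available: bool, n: int, batch_size: int = 1) -> str:
    """Illustrative device-selection heuristic (NOT Moreau's internals).

    The thresholds here are hypothetical.
    """
    if not cuda_available:
        return 'cpu'    # no GPU present: nothing to decide
    if batch_size > 64:
        return 'cuda'   # large batches parallelize well on the GPU
    if n > 1000:
        return 'cuda'   # large single problems amortize transfer cost
    return 'cpu'        # small workloads avoid GPU launch overhead

print(pick_device(True, n=50))                   # small problem -> 'cpu'
print(pick_device(True, n=50, batch_size=128))   # big batch -> 'cuda'
```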
Manual Device Selection¶
Override device selection via Settings:
# Force CPU
settings = moreau.Settings(device='cpu')
solver = moreau.Solver(P, q, A, b, cones=cones, settings=settings)
# Force CUDA
settings = moreau.Settings(device='cuda')
solver = moreau.Solver(P, q, A, b, cones=cones, settings=settings)
Checking Availability¶
Query available devices:
import moreau
# List all available devices
print(moreau.available_devices()) # ['cpu', 'cuda'] or ['cpu']
# Check specific device
print(moreau.device_available('cuda')) # True/False
# Get default device
print(moreau.default_device()) # 'cuda' or 'cpu'
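A common use of these queries is a graceful CPU fallback when no GPU is present. The sketch below stubs `device_available` so it runs standalone; in real code you would call `moreau.device_available` instead.

```python
def device_available(name: str) -> bool:
    # Stand-in for moreau.device_available(); this stub reports CPU only.
    return name == 'cpu'

# Prefer CUDA when present, otherwise fall back to the CPU.
device = 'cuda' if device_available('cuda') else 'cpu'
print(device)  # 'cpu' with the stub above
```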
Global Default¶
Set a global default device:
import moreau
# Set default for all new solvers
moreau.set_default_device('cuda')
# Now all solvers use CUDA by default
solver = moreau.Solver(P, q, A, b, cones=cones) # Uses CUDA
# Can still override per-solver
settings = moreau.Settings(device='cpu')
cpu_solver = moreau.Solver(P, q, A, b, cones=cones, settings=settings)
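When a global default is in play, it can be useful to switch it temporarily and restore it afterwards. A sketch of that pattern, with stand-ins for `moreau.default_device` and `moreau.set_default_device` so it runs standalone:

```python
from contextlib import contextmanager

# Stand-ins for moreau.default_device / moreau.set_default_device.
_default = 'cpu'

def default_device() -> str:
    return _default

def set_default_device(name: str) -> None:
    global _default
    _default = name

@contextmanager
def use_device(name: str):
    """Temporarily change the default device, restoring it on exit."""
    previous = default_device()
    set_default_device(name)
    try:
        yield
    finally:
        set_default_device(previous)

with use_device('cuda'):
    print(default_device())  # 'cuda' inside the block
print(default_device())      # 'cpu' again afterwards
```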
Framework Integration¶
PyTorch¶
The PyTorch integration follows the same pattern:
import torch
from moreau.torch import Solver
import moreau
cones = moreau.Cones(num_nonneg_cones=2)
# Auto-selects device
solver = Solver(
    n=2, m=2,
    P_row_offsets=torch.tensor([0, 1, 2]),
    P_col_indices=torch.tensor([0, 1]),
    A_row_offsets=torch.tensor([0, 1, 2]),
    A_col_indices=torch.tensor([0, 1]),
    cones=cones,
)
print(f"Device: {solver.device}")
# Force specific device
settings = moreau.Settings(device='cuda')
solver = Solver(
    n=2, m=2,
    P_row_offsets=torch.tensor([0, 1, 2]),
    P_col_indices=torch.tensor([0, 1]),
    A_row_offsets=torch.tensor([0, 1, 2]),
    A_col_indices=torch.tensor([0, 1]),
    cones=cones,
    settings=settings,
)
JAX¶
Similarly for JAX:
import jax.numpy as jnp
from moreau.jax import Solver
import moreau
cones = moreau.Cones(num_nonneg_cones=2)
solver = Solver(
    n=2, m=2,
    P_row_offsets=jnp.array([0, 1, 2]),
    P_col_indices=jnp.array([0, 1]),
    A_row_offsets=jnp.array([0, 1, 2]),
    A_col_indices=jnp.array([0, 1]),
    cones=cones,
)
print(f"Device: {solver.device}")
Performance Guidance¶
| Scenario | Recommended Device |
|---|---|
| Single small problem (n < 100) | CPU |
| Single large problem (n > 1000) | CUDA |
| Small batch (< 16) | CPU or CUDA |
| Large batch (> 64) | CUDA |
| Training loop | CUDA |
| Quick prototyping | CPU |
The automatic selection generally makes good choices, but you can override it when you know your workload's characteristics.