Device Selection

Moreau automatically selects the best available device, or you can manually specify CPU or GPU.

Automatic Selection

By default, Moreau uses the best available device:

import moreau

# Uses CUDA if available, otherwise CPU
solver = moreau.Solver(P, q, A, b, cones=cones)
print(f"Using device: {solver.device}")

The automatic selection considers the following factors; an illustrative sketch follows the list:

  1. CUDA availability

  2. Problem size (small problems may run faster on CPU)

  3. Batch size (larger batches benefit more from GPU)
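
The sketch below shows how these factors can interact. It is purely illustrative: the pick_device helper and its thresholds are hypothetical rather than Moreau's actual selection logic, and it uses moreau.device_available, covered under Checking Availability below.

import moreau

def pick_device(n, batch_size=1):
    """Hypothetical heuristic, for illustration only."""
    if not moreau.device_available('cuda'):
        return 'cpu'
    # Illustrative thresholds: small single problems stay on CPU,
    # large problems or large batches go to the GPU
    if n < 100 and batch_size < 16:
        return 'cpu'
    return 'cuda'

settings = moreau.Settings(device=pick_device(n=2000, batch_size=64))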


Manual Device Selection

Override device selection via Settings:

# Force CPU
settings = moreau.Settings(device='cpu')
solver = moreau.Solver(P, q, A, b, cones=cones, settings=settings)

# Force CUDA
settings = moreau.Settings(device='cuda')
solver = moreau.Solver(P, q, A, b, cones=cones, settings=settings)

Checking Availability

Query available devices:

import moreau

# List all available devices
print(moreau.available_devices())  # ['cpu', 'cuda'] or ['cpu']

# Check specific device
print(moreau.device_available('cuda'))  # True/False

# Get default device
print(moreau.default_device())  # 'cuda' or 'cpu'
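
These queries make it easy to choose a device explicitly before constructing a solver. The snippet below assumes the problem data P, q, A, b and cones from the earlier examples:

# Prefer CUDA when present, otherwise fall back to CPU
device = 'cuda' if moreau.device_available('cuda') else 'cpu'
settings = moreau.Settings(device=device)
solver = moreau.Solver(P, q, A, b, cones=cones, settings=settings)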

Global Default

Set a global default device:

import moreau

# Set default for all new solvers
moreau.set_default_device('cuda')

# Now all solvers use CUDA by default
solver = moreau.Solver(P, q, A, b, cones=cones)  # Uses CUDA

# Can still override per-solver
settings = moreau.Settings(device='cpu')
cpu_solver = moreau.Solver(P, q, A, b, cones=cones, settings=settings)
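
A common startup pattern is to combine the availability check with the global default, so the same script falls back gracefully on machines without a GPU:

import moreau

# Set the process-wide default once, e.g. at program startup
moreau.set_default_device('cuda' if moreau.device_available('cuda') else 'cpu')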

Framework Integration

PyTorch

The PyTorch integration follows the same pattern:

import torch
from moreau.torch import Solver
import moreau

cones = moreau.Cones(num_nonneg_cones=2)

# Auto-selects device
solver = Solver(
    n=2, m=2,
    P_row_offsets=torch.tensor([0, 1, 2]),
    P_col_indices=torch.tensor([0, 1]),
    A_row_offsets=torch.tensor([0, 1, 2]),
    A_col_indices=torch.tensor([0, 1]),
    cones=cones,
)
print(f"Device: {solver.device}")

# Force specific device
settings = moreau.Settings(device='cuda')
solver = Solver(
    n=2, m=2,
    P_row_offsets=torch.tensor([0, 1, 2]),
    P_col_indices=torch.tensor([0, 1]),
    A_row_offsets=torch.tensor([0, 1, 2]),
    A_col_indices=torch.tensor([0, 1]),
    cones=cones,
    settings=settings,
)
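
When forcing CUDA inside a training script, it is common to guard on a GPU check so the same code also runs on CPU-only machines. A short sketch reusing the constructor arguments above; torch.cuda.is_available() is used here as a proxy for Moreau's own CUDA detection, and moreau.device_available('cuda') works just as well:

# Fall back to CPU on machines without a GPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
settings = moreau.Settings(device=device)
solver = Solver(
    n=2, m=2,
    P_row_offsets=torch.tensor([0, 1, 2]),
    P_col_indices=torch.tensor([0, 1]),
    A_row_offsets=torch.tensor([0, 1, 2]),
    A_col_indices=torch.tensor([0, 1]),
    cones=cones,
    settings=settings,
)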

JAX

Similarly for JAX:

import jax.numpy as jnp
from moreau.jax import Solver
import moreau

cones = moreau.Cones(num_nonneg_cones=2)

solver = Solver(
    n=2, m=2,
    P_row_offsets=jnp.array([0, 1, 2]),
    P_col_indices=jnp.array([0, 1]),
    A_row_offsets=jnp.array([0, 1, 2]),
    A_col_indices=jnp.array([0, 1]),
    cones=cones,
)
print(f"Device: {solver.device}")

Performance Guidance

Scenario                            Recommended Device
Single small problem (n < 100)      CPU
Single large problem (n > 1000)     CUDA
Small batch (< 16)                  CPU or CUDA
Large batch (> 64)                  CUDA
Training loop                       CUDA
Quick prototyping                   CPU

The automatic selection generally makes good choices, but you can override when you know your workload characteristics.
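
For example, a batched workload might pick the device from the batch size alone. The threshold below mirrors the table above, and batch_size is assumed to come from your own code:

import moreau

batch_size = 128  # assumed to be known from your workload
use_cuda = batch_size > 64 and moreau.device_available('cuda')
settings = moreau.Settings(device='cuda' if use_cuda else 'cpu')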