
Debian Buster (10.2) running CUDA 10.0 and MuMax3.10β2

Foreword

This is a reference for other members of the lab on how to get CUDA 10.0 running on a Debian Buster server with MuMax3.10β2, in case they need to set up a second server in my absence. As always, read the documentation and man pages!

Installation requirements

This assumes you have a functioning, clean Debian Buster (10.2) installation with root or superuser privileges. If you are unsure what that means, you can switch to the root account,

    user@mymachine:~/$ su
    Password: 
    $

For brevity, the root prompt is shown as $ throughout this document.

Alternatively, be listed in the sudoers file and prefix each command with sudo. There are many documents online about setting this up. If sudo is missing, install it as root with apt install sudo.

Add the contrib and non-free components to /etc/apt/sources.list (the nvidia-driver package lives in non-free), like so,

    $ cat /etc/apt/sources.list
    # Superfluous comments removed for brevity
    deb http://ftp.ca.debian.org/debian buster main contrib non-free
    deb-src http://ftp.ca.debian.org/debian buster main contrib non-free

    deb http://deb.debian.org/debian-security/ buster/updates main contrib non-free
    deb-src http://deb.debian.org/debian-security/ buster/updates main contrib non-free

    deb http://ftp.ca.debian.org/debian buster-updates main contrib non-free
    deb-src http://ftp.ca.debian.org/debian buster-updates main contrib non-free

Afterwards, run apt update as superuser.
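
A quick way to confirm the edit is to look for any active deb line that still lacks one of the two components. The following is only a sketch: missing_components is a hypothetical helper name, not part of Debian's tooling, and it assumes the one-line sources.list format shown above.

```shell
# Hypothetical helper: print every active deb/deb-src line in a
# sources.list-style file that lacks the contrib or non-free component.
missing_components() {
    file=${1:-/etc/apt/sources.list}
    # Keep only deb/deb-src lines (comments are skipped), then drop the
    # lines that already name both components.
    grep -E '^[[:space:]]*deb(-src)?[[:space:]]' "$file" |
        awk '!(/ contrib( |$)/ && / non-free( |$)/)'
}

# Prints nothing when every line is complete.
if [ -r /etc/apt/sources.list ]; then
    missing_components /etc/apt/sources.list
fi
```

Any line it prints still needs contrib, non-free, or both appended.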

NVIDIA drivers and CUDA

An NVIDIA CUDA-capable card with a compute capability of 3.0 or higher is necessary, as CUDA 10.0 does not support cards with a lower compute capability. This information is found on NVIDIA's CUDA website.

Using apt, the Debian package manager, install the following packages.

    $ apt install nvidia-driver # should be version 418.74
    $ apt install nvidia-smi    # used to monitor the gfx card status
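
Once the driver module is loaded (typically after a reboot), nvidia-smi can confirm which driver version is actually in use. A minimal sketch, where gpu_summary is a hypothetical helper name; the query fields name and driver_version are standard nvidia-smi options.

```shell
# Print GPU name and driver version, or a hint when the driver is not usable.
gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader ||
            echo "nvidia-smi failed: is the nvidia kernel module loaded?"
    else
        echo "nvidia-smi not found: is the nvidia-smi package installed?"
    fi
}

gpu_summary
```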

Download CUDA 10.0 deb (local) file for Ubuntu 18.04. This is compatible with Debian Buster. The options to select on the CUDA 10.0 archive are as follows.

CUDA downloads settings

Once you have downloaded the 1.6 GB file, issue the following commands as root, or prefix each command with sudo.

    $ dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
    $ apt-key add /var/cuda-repo-<version>/7fa2af80.pub
    $ apt-get update
    $ apt-get install cuda
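
To check which toolkit release ended up installed, the nvcc compiler reports it. The sketch below assumes the toolkit landed in /usr/local/cuda-10.0 (adjust the path if your installation differs); cuda_release is a hypothetical helper that extracts the release number from the nvcc --version output.

```shell
# Extract the release number from `nvcc --version` output read on stdin,
# e.g. "Cuda compilation tools, release 10.0, V10.0.130".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Assumed install location of the toolkit; not guaranteed on every system.
NVCC=/usr/local/cuda-10.0/bin/nvcc
if [ -x "$NVCC" ]; then
    "$NVCC" --version | cuda_release
else
    echo "nvcc not found at $NVCC"
fi
```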

The CUDA driver API library is packaged separately; install it with

    $ apt install libcuda1

From here you may want to ensure the nvidia-persistenced daemon is running for monitoring the card. To enable it at boot and start it now, issue the following,

    $ systemctl enable nvidia-persistenced
    $ service nvidia-persistenced start

MuMax3

We are not using CUDA from the Debian repository because our current and future research requires MuMax3.10β2, which has a hard dependency on CUDA 10.0. At this time we are using the package mumax3.10beta2.cuda100.linux64.tar.gz, which can be downloaded under the Assets tab of the mumax 3.10β release. Install it into /usr/local/bin if you want every user to have access to MuMax3.

    $ wget https://github.com/mumax/3/releases/download/3.10beta/mumax3.10beta2.cuda100.linux64.tar.gz
    $ tar -xzvf mumax3.10beta2.cuda100.linux64.tar.gz -C /usr/local/bin

You should be able to run MuMax3 simulations from any account on the server!

Job Queueing

If you have a complicated setup, it may be wise to set up SLURM or task spooler for scheduling tasks; this is outside the scope of this document. Our case doesn't need a complicated setup, so at or batch is enough; we mostly use at.

For Users

We are using the at command. This is very convenient, as we can add commands to the queue and they are processed when available or at a later time, depending on the options passed. The at command requires a time and reads the commands to run either from STDIN or from a file given with the -f option.

    $ cat mycommandstoexecute.sh
    mumax3 mymagsim.mx3 -wait
    $ at now + 1 days < mycommandstoexecute.sh

The above will execute the commands listed in mycommandstoexecute.sh, 1 day from now.

You can also enter the commands manually: at drops to its own prompt, where you type the commands you want to execute. To finish, press CTRL-d, which sends EOF (end of file).

    $ at now + 12 hours
    warning: commands will be executed using /bin/sh
    at> mumax3 mymagsim.mx3 -wait
    at> CTRL-d
    job 1 at Fri Feb 21 14:00:00 2020

The -wait option isn't part of MuMax3 itself. On this server, mumax3 is a wrapper script that prevents multiple instances from running at the same time. See For Administrators for its definition.

The at command does not accept the commands to execute as arguments on its own command line; they must come from a script input file or be typed at the at> prompt.

If you do not want to send completion messages to the administrator, use at -M.
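
Queued jobs can be inspected and removed with atq and atrm, which ship in the same package as at. The sketch below is only illustrative: at_job_id is a hypothetical helper that pulls the job number out of the confirmation line at prints on stderr, so a script can refer to the job later.

```shell
# Hypothetical helper: extract the job number from at's confirmation line,
# e.g. "job 1 at Fri Feb 21 14:00:00 2020".
at_job_id() {
    sed -n 's/^job \([0-9][0-9]*\) at .*/\1/p'
}

# Usage sketch (needs the at package and a running atd, so left commented):
#   id=$(at -f mycommandstoexecute.sh now + 1 days 2>&1 | at_job_id)
#   atq            # list pending jobs
#   at -c "$id"    # show what the job will run
#   atrm "$id"     # remove the job from the queue
```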

For Administrators

To have some minimal scheduling we need to install at, which comes with the series of at* commands and batch,

    $ apt install at

For MuMax3 I renamed the binary and wrote a small script that only executes it when no instance of MuMax3 is running.

    $ mv /usr/local/bin/mumax3 /usr/local/bin/mumax3.10beta

Below is the new minimal definition of mumax3 to prevent multiple instances running. Remember to make it executable with chmod +x /usr/local/bin/mumax3.

    $ cat /usr/local/bin/mumax3
    #!/bin/sh
    #
    # 0BSD License 
    # 
    # Copyright (c) 2020, David Kalliecharan 
    # 
    # Permission to use, copy, modify, and/or distribute this software 
    # for any purpose with or without fee is hereby granted.
    # 
    # THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL 
    # WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED 
    # WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE 
    # AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR 
    # CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS 
    # OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, 
    # NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN 
    # CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
    # 
    #
    # The mumax3 script is a minimal wrapper for mumax3.10beta (originally mumax3). 
    # It checks if a mumax3.10beta instance is running; this is to prevent two
    # instances from running and overloading the NVIDIA GPU. A clever/ignorant  user 
    # will just run mumax3.10beta, but most people don't expect the executable to be
    # renamed. This script is handy when using with at/batch command.
    
    EXEC=mumax3.10beta
    WAIT=false
    
    # Parse arguments for the '-wait' option to block until mumax3.10beta is free.
    # All other arguments are re-appended to "$@" so their quoting is preserved.
    for arg in "$@"
    do
            shift
            if [ "$arg" = "-wait" ]
            then
                    WAIT=true
                    continue
            fi
            set -- "$@" "$arg"
    done
    
    PID=$(pgrep "$EXEC")
    
    # Check for PID of mumax3.10beta, print user info if present
    if [ -n "$PID" ]
    then
            # $PID is deliberately unquoted: pgrep may return several PIDs
            ps -o user,pid,%cpu,%mem,etime,command $PID
            if [ "$WAIT" = "false" ]
            then
                    echo "!! ERROR: MuMax3 instance is currently running" >&2
                    exit 1
            fi
    
            # Poll once per second until no instance is left
            echo "!! WARNING: MuMax3 is currently running, waiting until process completes"
            while [ -n "$PID" ]
            do
                    sleep 1
                    PID=$(pgrep "$EXEC")
            done
    fi
    
    "$EXEC" "$@"
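
One limitation of polling with pgrep is that two waiting jobs can both see the GPU become free within the same second and start together. A possible alternative, sketched here under assumptions and not what this server actually runs, is to let flock(1) from util-linux serialise the jobs. The lock path /var/local/mumax3.lock and the names MUMAX3_LOCK and run_locked are hypothetical; the lock file would need to be created once by the administrator and be readable by every MuMax3 user.

```shell
# Hypothetical lock file; override with MUMAX3_LOCK if another path suits.
LOCK=${MUMAX3_LOCK:-/var/local/mumax3.lock}

# run_locked MODE COMMAND [ARGS...]
#   MODE=wait  block until the lock is free, then run COMMAND
#   MODE=try   fail immediately if another job holds the lock
run_locked() {
    mode=$1
    shift
    case $mode in
        wait) flock "$LOCK" "$@" ;;
        try)  flock -n "$LOCK" "$@" ;;
        *)    echo "usage: run_locked wait|try command..." >&2; return 2 ;;
    esac
}

# Example (commented, as it needs the renamed binary):
#   run_locked wait mumax3.10beta mymagsim.mx3
```

The lock is released automatically when the command exits, even if it crashes, so no stale state has to be cleaned up.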

© 2017–2022 David Kalliecharan