Your first job on Euler

Welcome to your first hands-on experience with Euler! This tutorial will guide you through submitting and managing your very first computational job on the cluster. By the end of this tutorial, you'll understand how to write job scripts, submit them to the queue, monitor their progress, and retrieve results.

Prerequisites

Before starting this tutorial, make sure you:

  • Have an active Euler account and can log in to the cluster via SSH
  • Are comfortable with basic Linux shell commands (cd, ls, and an editor such as nano)

What you'll learn

  • How to write a simple job script
  • How to submit jobs using sbatch
  • How to monitor job status with squeue and myjobs
  • How to retrieve and understand job output
  • Common troubleshooting steps

Step 1: Understanding the environment

First, let's explore where you are and what's available:

# Check your current location
pwd
Expected output: /cluster/home/yourusername

# Check available storage
lquota
Expected output: A table showing your storage quotas for home, scratch, and any group storage.

# Check what modules are available
module avail

Step 2: Create your first job script

Let's create a simple job that demonstrates basic SLURM functionality. We'll make a script that:

  1. Prints system information
  2. Runs a simple calculation
  3. Creates some output files

Create a new file called my_first_job.sh:

nano my_first_job.sh

Copy and paste this content into the file:

#!/bin/bash
#SBATCH --job-name=my_first_job
#SBATCH --time=00:05:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --output=first_job_%j.out
#SBATCH --error=first_job_%j.err

# Print job information
echo "=========================================="
echo "Job started on: $(date)"
echo "Job ID: $SLURM_JOB_ID"
echo "Running on node: $SLURMD_NODENAME"
echo "Number of CPUs: $SLURM_CPUS_PER_TASK"
echo "Working directory: $PWD"
echo "=========================================="

# Print system information
echo "System information:"
echo "Hostname: $(hostname)"
echo "Operating System: $(uname -a)"
echo "CPU info: $(lscpu | grep 'Model name' | head -1)"
echo "Memory info: $(free -h | grep 'Mem:')"
echo ""

# Do some simple calculations
echo "Performing calculations..."
echo "Computing squares of numbers 1-10:"
for i in {1..10}; do
    square=$((i * i))
    echo "$i squared = $square"
done

# Create some output files
echo "Creating output files..."
echo "Hello from Euler!" > hello.txt
echo "Job completed successfully" > status.txt

# List files in current directory
echo ""
echo "Files created:"
ls -la *.txt

echo ""
echo "=========================================="
echo "Job completed on: $(date)"
echo "Total runtime: $SECONDS seconds"
echo "=========================================="

Save the file (Ctrl+X, then Y, then Enter in nano).
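
Before submitting, it can be worth syntax-checking the script: bash's -n flag parses a file without executing anything in it. A quick sketch (using a throwaway file in /tmp purely for illustration):

```shell
# Create a stand-in script, then parse it without executing it (bash -n).
printf '#!/bin/bash\necho "hello"\n' > /tmp/syntax_demo.sh
bash -n /tmp/syntax_demo.sh && echo "syntax OK"
```

If the script contains a syntax error, bash -n reports it and exits non-zero without running any commands.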

Let's examine what each part does:

Understanding the SBATCH Directives

  • --job-name: A descriptive name for your job
  • --time: Maximum runtime (5 minutes for this example)
  • --ntasks: Number of tasks (1 for a simple serial job)
  • --cpus-per-task: CPUs per task (1 CPU is sufficient)
  • --mem-per-cpu: Memory per CPU (1GB should be plenty)
  • --output: Where to save standard output (%j gets replaced with job ID)
  • --error: Where to save error messages
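
These directives combine: SLURM grants your job roughly ntasks × cpus-per-task × mem-per-cpu of memory in total. A quick sanity check in bash, using the values from the script above:

```shell
# Total memory = ntasks x cpus-per-task x mem-per-cpu (here in MB).
ntasks=1
cpus_per_task=1
mem_per_cpu_mb=1024                                    # 1G
total_mb=$(( ntasks * cpus_per_task * mem_per_cpu_mb ))
echo "total memory: ${total_mb} MB"                    # -> total memory: 1024 MB
```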

Step 3: Submit your job

Now let's submit the job to the queue:

sbatch my_first_job.sh

Expected output: Submitted batch job 12345678 (your job ID will be different)

The job ID is important - write it down! You'll use it to monitor and reference your job.
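
If you want to capture the job ID in a script rather than writing it down, the submission message can be parsed (sbatch also offers a --parsable flag that prints only the ID). A minimal sketch, simulating the sbatch output with echo:

```shell
# "Submitted batch job 12345678" -- the job ID is the fourth field.
submit_msg="Submitted batch job 12345678"    # stand-in for: $(sbatch my_first_job.sh)
jobid=$(echo "$submit_msg" | awk '{print $4}')
echo "captured job ID: $jobid"               # -> captured job ID: 12345678
```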

Step 4: Monitor your job

Let's check the status of your job:

# Check job status (replace 12345678 with your actual job ID)
squeue -j 12345678

Expected output while running:

     JOBID PARTITION         NAME     USER ST       TIME  NODES NODELIST(REASON)
  12345678       gfs my_first_job username  R       0:01      1 eu-g2-001

You can also use the more user-friendly myjobs command:

myjobs

This shows all your jobs in a more readable format.

Job states you might see:

  • PD (Pending): Waiting in queue
  • R (Running): Currently executing
  • CG (Completing): Finishing up
  • CD (Completed): Finished successfully

Step 5: Retrieve and examine results

Once your job completes (should take less than a minute), let's examine the results:

# List files in your directory
ls -la

You should see:

  • my_first_job.sh (your original script)
  • first_job_12345678.out (output file with your job ID)
  • first_job_12345678.err (error file, hopefully empty)
  • hello.txt and status.txt (files created by your job)

Let's examine the output:

# View the main output file
cat first_job_*.out

You should see detailed information about your job execution, system info, calculations, and timestamps.

# Check if there were any errors
cat first_job_*.err

This file should be empty if everything went well.
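
You can check this programmatically: the shell's -s test is true when a file exists and is non-empty. A small sketch (using a throwaway file in place of the real error file):

```shell
# -s is true when a file exists and has nonzero size.
touch /tmp/demo_job.err                 # stand-in for first_job_12345678.err
if [ -s /tmp/demo_job.err ]; then
    echo "errors were reported"
else
    echo "error file is empty"
fi
```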

# Check the files your job created
cat hello.txt
cat status.txt

Step 6: Your second job - with parameters

Now let's create a more sophisticated job that takes parameters. Create parametric_job.sh:

nano parametric_job.sh

Copy and paste this content into the file:

#!/bin/bash
#SBATCH --job-name=param_job
#SBATCH --time=00:03:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=512M
#SBATCH --output=param_job_%j.out

# Set default values
NUMBER=${1:-100}
OPERATION=${2:-"square"}

echo "Job started: $(date)"
echo "Parameter 1 (number): $NUMBER"
echo "Parameter 2 (operation): $OPERATION"
echo "Using $SLURM_CPUS_PER_TASK CPUs"

# Perform operation based on parameter
case $OPERATION in
    "square")
        result=$((NUMBER * NUMBER))
        echo "$NUMBER squared = $result"
        ;;
    "cube")
        result=$((NUMBER * NUMBER * NUMBER))
        echo "$NUMBER cubed = $result"
        ;;
    "double")
        result=$((NUMBER * 2))
        echo "$NUMBER doubled = $result"
        ;;
    *)
        echo "Unknown operation: $OPERATION"
        echo "Supported operations: square, cube, double"
        exit 1
        ;;
esac

echo "Result: $result"
echo "Job completed: $(date)"

Submit this job with different parameters:

# Default parameters (100, square)
sbatch parametric_job.sh

# Custom parameters
sbatch parametric_job.sh 25 cube
sbatch parametric_job.sh 7 double
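
The ${1:-100} syntax is bash's default-value expansion: use the first positional argument if one was given, otherwise fall back to 100. You can check the behaviour outside SLURM with a small function that mirrors the parameter handling in parametric_job.sh:

```shell
# Mirror of the default-value expansion used in parametric_job.sh.
show_params() {
    local number=${1:-100}        # first argument, defaulting to 100
    local operation=${2:-square}  # second argument, defaulting to "square"
    echo "$number $operation"
}
show_params            # -> 100 square
show_params 25 cube    # -> 25 cube
```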

Step 7: Interactive jobs

Sometimes you want to run commands interactively on a compute node. Here's how:

# Start an interactive session
srun --time=00:10:00 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=1G --pty bash

Once the interactive session starts, you'll see a different prompt indicating you're on a compute node:

# You're now on a compute node! Try these commands:
hostname
whoami
echo "I'm running on a compute node!"

# Exit the interactive session
exit

Step 8: Checking job history

View information about completed jobs:

# See your recent jobs
sacct --format=JobID,JobName,State,ExitCode,Elapsed,ReqMem,MaxRSS

This shows you historical information about your jobs, including how much memory they actually used.
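
sacct reports MaxRSS with a unit suffix (for example 10240K). A small helper to convert such a value to MiB, for easier comparison with your --mem-per-cpu request (the helper name is ours, and it assumes the K/M/G suffixes sacct normally emits):

```shell
# Convert a SLURM memory figure like "10240K" or "2G" to MiB.
rss_to_mib() {
    local val=$1
    case $val in
        *K) echo $(( ${val%K} / 1024 )) ;;
        *M) echo "${val%M}" ;;
        *G) echo $(( ${val%G} * 1024 )) ;;
        *)  echo "unrecognised unit: $val" >&2; return 1 ;;
    esac
}
rss_to_mib 10240K   # -> 10
rss_to_mib 2G       # -> 2048
```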

Common issues and solutions

Problem: Job stays in PENDING (PD) state

Possible causes:

  • Resource request too high
  • Cluster is busy
  • Invalid resource specification

Solution: Check with squeue -j JOBID and look at the REASON column.

Problem: Job fails immediately

Check:

  1. Error file: cat first_job_*.err
  2. Exit code: sacct -j JOBID --format=ExitCode
  3. Script permissions: chmod +x my_first_job.sh
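
sacct's ExitCode field has the form exitcode:signal (for instance 1:0 means the script exited with status 1 and was not killed by a signal). The two parts can be split with parameter expansion:

```shell
# Split a sacct ExitCode value of the form "exit:signal".
exitcode="1:0"           # hypothetical value from: sacct -j JOBID --format=ExitCode
status=${exitcode%%:*}   # part before the colon (exit status)
signal=${exitcode##*:}   # part after the colon (terminating signal, 0 if none)
echo "exit status: $status, signal: $signal"
```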

Problem: "Permission denied" error

Solution:

chmod +x my_first_job.sh

Problem: Out of memory

Symptoms: Job gets killed, and the error output mentions memory.

Solution: Increase --mem-per-cpu in your script.

Best practices you've learned

  1. Always specify resources: Time, memory, CPUs
  2. Use descriptive job names: Makes monitoring easier
  3. Include logging: Print start/end times and job info
  4. Handle errors gracefully: Check exit codes
  5. Test with small jobs first: Before running large computations
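
Several of these practices can be baked into every script you write. A minimal sketch of a defensive job-script skeleton (the set flags and log helper are plain bash, not SLURM-specific; the computation is a stand-in):

```shell
#!/bin/bash
# Abort on any command error or use of an unset variable
# (under bash you can also add: set -o pipefail).
set -eu

# Timestamped logging helper.
log() { echo "[$(date +%H:%M:%S)] $*"; }

log "job starting"
result=$(( 6 * 7 ))        # stand-in for the real computation
log "result: $result"
log "job finished"
```

Because of set -eu, any failing step aborts the job immediately with a non-zero exit code, which then shows up in sacct.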

Next steps

Congratulations! You've successfully:

✅ Created and submitted your first job
✅ Monitored job execution
✅ Retrieved and analyzed results
✅ Learned about interactive sessions
✅ Passed parameters to your jobs

Quick reference

# Submit a job
sbatch my_script.sh

# Check job status
squeue -u $USER
myjobs

# Cancel a job
scancel JOBID

# Interactive session
srun --pty bash

# Job history
sacct

Well Done!

You've completed your first job on Euler! You now have the foundation to run computational work on the cluster. Remember: start small, test thoroughly, and don't hesitate to ask for help when needed.

Need Help?

If you encounter issues or have questions about job submission, check our FAQ or contact support.