iiiiiiitttttttttttt

you know the computer thing is it plugged in?

A community for memes and posts about tech and IT related rage.

A user attempted to allocate 32 TERABYTES of memory for a job (lemmy.blahaj.zone)

submitted 1 month ago* (last edited 1 month ago) by rockSlayer@lemmy.blahaj.zone to c/iiiiiiitttttttttttt@programming.dev

I like my job, so no screenshots. Sorry.

Notes:

sbatch is the Slurm command for submitting batch jobs to high-performance compute nodes
Each huge-n128-512g node has 128 cores and 512 GiB of memory
This is occurring at a medical research nonprofit

User: Hello everyone, this is the first time I'm using GCP. I'm trying to run a job, but it keeps failing. These are the sbatch headers I'm using:

#SBATCH --partition=huge-n128-512g
#SBATCH --nodes=8
#SBATCH --mail-user=user@institute.org
#SBATCH --mail-type=FAIL
#SBATCH --mem-per-cpu=32G

IT: Please make sure you need to use that node; each one costs $4,500/month to use. Can you describe the job you're trying to do?

User: I'm doing high-depth genetic sequencing using 3 GB BAM files.

(Additional note: there's usually only one BAM file per chromosome, so 23 × 3 GB = 69 GB total. Nice.)

IT: Those BAM files are pretty small. I'd recommend starting with the med-n16-64g node and moving up if needed. We're only billed for run time, so if the jobs take the same amount of time, it would be about 13% of the cost.

The astute among you will notice that an 8-node swarm with 32 GiB of memory per core is 32 TiB total (8 nodes × 128 cores × 32 GiB). The job was failing because the --mem-per-cpu flag asked for 4 TiB on each node (128 cores × 32 GiB), eight times the 512 GiB each node actually has. Even without that flag, the swarm would have used 4 TiB of memory (8 × 512 GiB). Holy overallocation, Batman!
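
For anyone who wants to check the arithmetic, here is a rough shell sketch. The numbers come straight from the headers and notes in this post. The variable names are made up for illustration, and it assumes (as the post does) that every core on every node gets its full --mem-per-cpu share and that "32G" means 32 GiB.

```shell
#!/usr/bin/env bash
# Sanity-check the memory request described in the post.
# Values are from the post; variable names are hypothetical.

nodes=8              # --nodes=8
cores_per_node=128   # huge-n128-512g: 128 cores per node
node_mem_gib=512     # huge-n128-512g: 512 GiB per node
mem_per_cpu_gib=32   # --mem-per-cpu=32G (assumed to mean GiB)

# Memory requested on one node if every core gets its per-CPU share.
requested_per_node_gib=$(( cores_per_node * mem_per_cpu_gib ))

# Memory requested across the whole allocation, in TiB.
requested_total_tib=$(( nodes * requested_per_node_gib / 1024 ))

# Memory that actually exists across the allocation, in TiB.
available_total_tib=$(( nodes * node_mem_gib / 1024 ))

echo "requested per node: ${requested_per_node_gib} GiB (node has ${node_mem_gib} GiB)"
echo "requested in total: ${requested_total_tib} TiB"
echo "available in total: ${available_total_tib} TiB"

# Flag the request if it cannot fit on a single node.
if (( requested_per_node_gib > node_mem_gib )); then
    echo "over-allocated by $(( requested_per_node_gib / node_mem_gib ))x per node"
fi
```

Running it reports 4096 GiB requested per node against 512 GiB available, 32 TiB requested in total against 4 TiB available, and an 8x over-allocation per node.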
