this post was submitted on 06 Mar 2026
Sysadmin
I'm in a similar position to you. Our lab has a partition on the HPC cluster, but I need a way to quasi-administer other lab members' work without actually having root access. What works for me is a shared bashrc script (which also holds useful common aliases and environment variables) that every user sources from their own ~/.bashrc. Set the umask in that shared file, and make certain folders read-only (common references, e.g. genomes) if you don't want people messing with shared resources.

That said, I've found it's only worth trying to administer shared resources and large datasets; otherwise, let everyone junk up their home folder with their own analyses. If home folders are size-limited, create a per-user folder on the scratch partition and let people store their junk there however they want. Just check routinely that nobody is abusing your storage quota.
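A minimal sketch of what such a shared rc file might look like (the `/scratch/ourlab` paths and the Slurm alias are hypothetical examples; adapt to your cluster):

```shell
# shared_bashrc -- each user sources this from their own ~/.bashrc, e.g.:
#   [ -f /scratch/ourlab/shared_bashrc ] && source /scratch/ourlab/shared_bashrc
# (the /scratch/ourlab location is a made-up example)

umask 007                              # new files: group read/write, others nothing

export LAB_DATA=/scratch/ourlab/data   # common environment variables
alias sq='squeue -u "$USER"'           # common aliases, e.g. "my queued jobs"

# One-time setup by the quasi-admin (not per-login): lock the reference data
# down to read-only for everyone, so nobody edits the genomes by accident:
#   chmod -R a-w "$LAB_DATA/references"
```

The umask of 007 keeps everything group-shareable but invisible to other labs; tighten it to 077 if your group setup is per-user rather than per-lab.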
EDIT: under absolutely no circumstances give people write access to raw shared data on the HPC. I guarantee some idiot will edit it and mess it up for everyone. If people need to rename files, they can learn to symlink them.
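On the renaming point: a symlink gives each user their own name for a read-only shared file without touching the original (the paths below are made-up examples):

```shell
# The shared raw file stays untouched and read-only; you just point at it
# from your own space under whatever name you prefer.
raw=/scratch/ourlab/data/references/GRCh38.fa   # hypothetical shared file
mkdir -p ~/links
ln -sf "$raw" ~/links/genome.fa                 # your private alias for it
ls -l ~/links/genome.fa                         # shows where the link points
```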
This is a pretty good idea!
In addition, I recommend keeping all data as, e.g., a (private) DataLad dataset synchronized to Dataverse, OSF, Figshare or wherever - that way edits are versioned
I generally use DVC to version data - are those better options?
I don't know, seems to be quite similar :)
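In case it helps, a rough sketch of the DVC side of that workflow (guarded so it only runs where `dvc` is installed; the "backup" remote here is just a throwaway local directory standing in for real shared or remote storage):

```shell
if command -v dvc >/dev/null 2>&1; then        # skip cleanly if dvc is absent
  proj=$(mktemp -d) && cd "$proj"
  mkdir -p data/raw && echo "sample reads" > data/raw/sample.fastq
  git init -q
  git config user.email you@example.com && git config user.name "You"
  dvc init -q                                  # DVC lives inside a git repo
  dvc add data/raw                             # hash the data, write data/raw.dvc
  git add -A && git commit -qm "track raw data with DVC"
  dvc remote add -d backup "$(mktemp -d)"      # in practice: backup/shared storage
  dvc push -q                                  # copy the data out to the remote
fi
```

Git then tracks only the small `.dvc` pointer file, while the data itself lives in the remote - much the same versioned-archive idea as the DataLad setup above.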
Thanks, this is a great idea! I can tell you've been doing this for a long time and are speaking from experience. Regarding shared data: I use it more as a way to hand raw data to other people and collect results from them - mostly as a temporary directory for transferring data; anything significant gets copied over to my share and backed up.
I can see how you'd be worried about storage quota; luckily I don't have that many people to worry about. But it's funny you mention it, as I could really see someone stashing a few conda environments in there just because they've used up the quota inside their home directory...
If you're not that worried about storage, you can just make copies where necessary; then you don't really have to worry about permissions (apart from read, which is typically granted to the same group by default). But yeah, if there's any chance more than one person might work off the same copy of data on the HPC, make it read-only for peace of mind. Regarding conda envs: yeah, I keep a few common read-only conda environments so that scripts can be run by multiple users without the hassle of ensuring everyone has the same environment. Quite useful.
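The permissions side of that can be sketched with a plain directory standing in for the env (no conda needed for the demo; the real env would be built once with something like `conda create -p /scratch/ourlab/envs/shared ...`, a hypothetical path):

```shell
envdir=$(mktemp -d)                   # stand-in for /scratch/ourlab/envs/shared
echo "python=3.11" > "$envdir/pinned.txt"   # stand-in for the env's contents
chmod -R a-w "$envdir"                # everyone can read and run, nobody can edit
# Users then activate it by path, no write access required:
#   conda activate /scratch/ourlab/envs/shared
ls -ld "$envdir"                      # mode should show no "w" bits
```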
The shared environment thing seems like a very cool idea! I'll try to set it up.