Author Topic: Job Manager for remote SLURM execution of QuantumATK scripts  (Read 441 times)

Offline ssp

  • New ATK user
  • *
  • Posts: 8
  • Country: cn
  • Reputation: 0
    • View Profile
I can't set up remote SLURM execution for QuantumATK and I need help.
Here are my settings:

Offline Pieter Vancraeyveld

  • QuantumWise Staff
  • Regular ATK user
  • *****
  • Posts: 19
  • Country: dk
  • Reputation: 1
    • View Profile
It appears that 'bash' is not available on the server. Can you manually login and execute the command 'which bash'?

Offline ssp

Hello, I can log in to my supercomputer server. There I can run the "which bash" command, and the result is "/usr/bin/bash". I can also use the command "sbatch atk.slurm" to run the calculation. I have listed my files in the attachment.
« Last Edit: July 2, 2020, 05:01 by ssp »

Offline Pieter Vancraeyveld

It is not clear why the diagnostic check fails if you can manually execute 'which bash'.

Can you empty the 'Scripts to source' field in the 'Environment' tab of the machine settings and submit a new script using the job manager?

Offline ssp

I tried, but it still doesn't work. Here are my settings.
« Last Edit: July 7, 2020, 07:19 by ssp »

Offline Pieter Vancraeyveld

Can you execute the following commands one by one and share the output?

ssh liuxiaolin@10.10.114.204 'echo $0; echo $SHELL; env'
ssh liuxiaolin@10.10.114.204
echo $0
echo $SHELL
env
exit

Thanks, Pieter

Offline ssp

Thank you. I put the output in the output.txt file.

Offline Pieter Vancraeyveld

I can see that my instructions were not 100% clear. Can you also execute

ssh liuxiaolin@10.10.114.204 'echo $0; echo $SHELL; env'

as a single command and provide the output?
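Errors raised by login scripts are written to standard error, so it can help to save the two output streams of this single command in separate files before sharing them. A minimal sketch, assuming bash on the local machine; the helper name `capture_streams` and the file prefix `diag` are hypothetical and not part of QuantumATK or SLURM:

```shell
#!/usr/bin/env bash
# capture_streams PREFIX CMD [ARGS...]
# Runs CMD and writes its stdout to PREFIX.out and its stderr to PREFIX.err.
# (Hypothetical helper for illustration only.)
capture_streams() {
    local prefix=$1
    shift
    "$@" >"${prefix}.out" 2>"${prefix}.err"
}

# Usage (not executed here; needs access to the login node):
#   capture_streams diag ssh liuxiaolin@10.10.114.204 'echo $0; echo $SHELL; env'
#   cat diag.err    # anything printed here came from stderr
```

With the streams separated, diag.out holds the environment listing and diag.err holds only the error messages.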

Offline ssp

Thank you. Here is the output of the full command "ssh liuxiaolin@10.10.114.204 'echo $0; echo $SHELL; env'", run as a single command.

Offline Pieter Vancraeyveld

Your output contains a lot of "command not found" errors. This indicates errors in the scripts being sourced at login. Can you check or share ~/.bashrc on your login node? It is likely that PATH is set incorrectly.
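One way to test the PATH hypothesis is to list PATH entries that do not exist and commands that cannot be resolved. A minimal sketch, assuming bash on the login node; the function names `check_path_entries` and `require_commands` are hypothetical:

```shell
#!/usr/bin/env bash
# check_path_entries "dir1:dir2:..."
# Prints "missing: DIR" for every entry that is not an existing directory.
check_path_entries() {
    local entries entry
    IFS=: read -r -a entries <<<"$1"
    for entry in "${entries[@]}"; do
        [ -d "$entry" ] || printf 'missing: %s\n' "$entry"
    done
}

# require_commands CMD...
# Prints "not found: CMD" for every command that cannot be resolved via PATH.
require_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf 'not found: %s\n' "$cmd"
    done
}

# Usage on the login node:
#   check_path_entries "$PATH"
#   require_commands bash sbatch
```

No output from either function means every PATH entry exists and every listed command was found.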

Offline ssp

Thank you. Here is my .bashrc file.

Offline Pieter Vancraeyveld

Please ssh to your login node and execute the following command:

bash -c 'echo $0; echo $-; env'

What output do you get?
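For context, `$0` reports the name the shell was started with, and `$-` lists the shell's active option flags. In bash the flag `i` appears only in interactive shells, so the command above shows how a non-interactive shell behaves. A minimal sketch of reading that flag string; the function name `shell_mode` is hypothetical:

```shell
#!/usr/bin/env bash
# shell_mode FLAGS
# Classifies a "$-" flag string: bash includes "i" only for interactive shells.
shell_mode() {
    case $1 in
        *i*) echo interactive ;;
        *)   echo non-interactive ;;
    esac
}

# A command passed with -c runs in a non-interactive shell:
shell_mode "$(bash -c 'echo $-')"
```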

Offline ssp

Here is my output file.

Offline Pieter Vancraeyveld

Based on your input, it appears that one of your bash startup scripts (.profile, .bashrc, .bash_profile) contains an error. After you have fixed it, you should be able to execute

ssh liuxiaolin@10.10.114.204 'echo $0; echo $SHELL; env'

as a single command without getting error messages.
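To narrow down which startup script is at fault, each file can be sourced on its own in a fresh shell while only its error output is kept. A minimal sketch, assuming bash and the `env` utility on the login node; the function name `startup_errors` is hypothetical. The minimal PATH of /usr/bin:/bin is an assumption, and a file that returns early in non-interactive shells will hide errors further down:

```shell
#!/usr/bin/env bash
# startup_errors FILE
# Sources FILE in a fresh, non-interactive bash with a minimal environment
# and prints only what the file writes to stderr (stdout is discarded).
startup_errors() {
    local bash_bin
    bash_bin=$(command -v bash) || return 1
    env -i HOME="$HOME" PATH=/usr/bin:/bin \
        "$bash_bin" --noprofile --norc -c '. "$1"' _ "$1" 2>&1 >/dev/null
}

# Usage on the login node (not executed here):
#   for f in ~/.profile ~/.bashrc ~/.bash_profile; do
#       [ -f "$f" ] && { echo "== $f"; startup_errors "$f"; }
#   done
```

A file that prints nothing raised no errors under these assumptions.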

Offline ssp

Re: Job Manager for remote SLURM execution of QuantumATK scripts
« Reply #14 on: July 10, 2020, 17:11 »
Thank you. Now I can use the Job Manager to submit jobs to the supercomputer. Our school's supercomputer has two login addresses, 10.10.114.203 and 10.10.114.204. I changed the IP address to 203 and it runs. However, I use Xshell to log in to both 203 and 204 with the same account, and on 204 I can also use the command "sbatch atk.slurm" to submit jobs. I don't know why 204 fails with the Job Manager, but it probably doesn't matter.

Users of the high-performance computing cluster need to log in to one of the cluster's login nodes, login1 or login2, which can be done in a variety of ways. The IP address of login1 is http://10.10.114.203 and that of login2 is http://10.10.114.204; the two login nodes provide load distribution and redundancy.
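Since the Job Manager works through 203 but not 204, comparing the non-interactive environments of the two login nodes may show where they differ. A minimal sketch, assuming bash with `diff`, `sort` and `grep` on the local machine; the function name `env_diff` and the file names are hypothetical, and environment values that span several lines are not handled:

```shell
#!/usr/bin/env bash
# env_diff FILE_A FILE_B
# Prints lines that occur in only one of two saved `env` dumps,
# prefixed with "<" (only in FILE_A) or ">" (only in FILE_B).
env_diff() {
    diff <(sort "$1") <(sort "$2") | grep '^[<>]'
}

# Usage (not executed here; needs access to both login nodes):
#   ssh liuxiaolin@10.10.114.203 env >env203.txt
#   ssh liuxiaolin@10.10.114.204 env >env204.txt
#   env_diff env203.txt env204.txt
```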