From: Ole Tange

Part 2: GNU Parallel script processing and execution

273 ratings | 24985 views
GNU Parallel version 20100620 (http://www.gnu.org/software/parallel/) is a shell tool for executing jobs in parallel locally or using remote machines. A job is typically a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables.

If you use xargs today you will find GNU parallel very easy to use as GNU parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. If you use ppss or pexec you will find GNU parallel will often make the command easier to read.

GNU parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU parallel as input for other programs.

For each line of input GNU parallel will execute command with the line as arguments. If no command is given, the line of input is executed. Several lines will be run in parallel. GNU parallel can often be used as a substitute for xargs or cat | bash.
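As a rough illustration of the "one job per input line" idea, the sketch below runs the same work as a plain shell loop and through GNU Parallel, then compares the results. The `item` text and the variable names are made up for the example, `-k` (`--keep-order`) is used so the output order matches the input order, and the fallback branch exists only so the snippet still runs where GNU Parallel is not installed.

```shell
# Sequential shell loop: one job per input line.
loop_out=$(seq 1 5 | while read -r n; do echo "item $n"; done)

# The same work through GNU Parallel; {} is replaced by the input line.
# -k (--keep-order) prints results in input order even if jobs finish out of order.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  par_out=$(seq 1 5 | parallel -k echo item {})
else
  # Assumption: GNU Parallel is not installed here, so reuse the loop result.
  par_out=$loop_out
fi

printf '%s\n' "$par_out"
```

Because the two outputs are identical, the parallel form can be dropped into a pipeline wherever the loop was used.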
Text Comments (22)
jagadeesha kanihal (1 year ago)
how many levels of nesting does parallel support? parallel inside parallel inside parallel and so on....
Ole Tange (1 year ago)
Just tested on a 2 GB machine: It can do 150 levels before it runs out of memory.
Ole Tange (1 year ago)
It should only be limited by number of processes, command line length, and files open.
PS (1 year ago)
Can parallel run at most N jobs at a time? For example, I have tons of files to be rsynced, but I want to pick them in blocks of 1G and post each 1G block to a group of hosts in parallel.
Ole Tange (1 year ago)
From https://www.gnu.org/software/parallel/man.html#OPTIONS:
--jobs N, -j N, --max-procs N, -P N
Number of jobslots on each machine. Run up to N jobs in parallel. 0 means as many as possible. Default is 100% which will run one job per CPU core on each machine.
The processes are not pinned. If you need that, use `taskset`. Most will never need this.
PS (1 year ago)
Thanks Ole, this tool is amazing. One question: does parallel spawn as many processes as there are cores? In other words, if -j0 is given, will the number of processes spawned equal the number of cores? And are those processes pinned to a specific core?
Ole Tange (1 year ago)
GNU Parallel runs N jobs in parallel. If you group your files in blocks of 1G you should be able to rsync them 1G at a time. Have a look at https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync
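A minimal sketch of capping the number of simultaneous jobs with `-j`, assuming four hypothetical file names and using `echo` as a stand-in for the real rsync command. With `-j 2`, at most two jobs run at once; `-k` keeps the output in input order. The fallback loop is only there so the snippet runs without GNU Parallel.

```shell
# Hypothetical file list; in practice this would come from find, ls, or a file.
files="a.dat b.dat c.dat d.dat"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # Run at most 2 jobs at a time; echo stands in for the real transfer command.
  sync_out=$(printf '%s\n' $files | parallel -k -j 2 echo would-rsync {})
else
  # Assumption: GNU Parallel is not installed, so run the jobs one by one.
  sync_out=$(printf '%s\n' $files | while read -r f; do echo "would-rsync $f"; done)
fi

printf '%s\n' "$sync_out"
```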
yogeshg1987 (2 years ago)
Holy shit, that's a large difference in run time: 6:01
b0rd3n (2 years ago)
Thank you for such an amazing tool
Krystian Wojtas (3 years ago)
Very nice tool! Thank you very much for that. I know it's all for educational purposes, but to be strict, a line like $ seq 1 10 | parallel -X echo mkdir test-{}.dir could be written more easily in bash using its brace expansion feature: $ echo mkdir test-{1..10}.dir
Hans-J. Schmid (3 years ago)
GNU Parallel is just awesome!!!
How to denote -S server1,server2,server3,server4?
Mark Ziemann (4 years ago)
@Pappagari RAGHAVENDRA REDDY use the "-S" switch. You need to have ssh set up without passwords.
Patrick Best (4 years ago)
fantastic for speeding a lengthy grep up.
xorbe2 (4 years ago)
This is 5 years old, how can it not be installed by default yet on a standard distro!!!
Ben Samuel (7 years ago)
@macias102: foo && bar means "run foo, and if foo is successful, then run bar". It's the same as writing: if foo; then bar; fi. There is another technique: foo || bar, which means "run foo, and if there's an error, run bar". foo || exit 1 is very common. It's like writing: if ! foo; then bar; fi. Try doing this: ls; echo $?; cat file_that_doesnt_exist; echo $? You should see that $? shows the return code: 0 indicates success and a non-zero value (usually 1) indicates an error.
Nathan Hargreaves (2 years ago)
If you use ";" it will run the next command regardless of the exit status of the last command. "&&" will ONLY run the next command if the last command was successful. They aren't the same.
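The difference between `&&`, `||`, and `;` described in this thread can be checked with a small sketch like the one below. It uses `true` and `false` as stand-ins for a succeeding and a failing command; the variable names and label strings are made up for the example.

```shell
# `true` always exits 0 (success); `false` always exits 1 (failure).
log=""
true  && log="$log and-after-success"   # runs: the left side succeeded
false && log="$log and-after-failure"   # skipped: the left side failed
true  || log="$log or-after-success"    # skipped: the left side succeeded
false || log="$log or-after-failure"    # runs: the left side failed

# With ';' the second command runs regardless of the first one's exit status.
semi=$(sh -c 'false; echo ran-anyway')

# $? holds the exit status of the most recent command.
status=0
sh -c 'exit 3' || status=$?

echo "log:$log"
echo "semi: $semi"
echo "status: $status"
```

Only the `and-after-success` and `or-after-failure` labels end up in `log`, while the command after `;` runs even though `false` failed.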
Moriwaka Kazuo (8 years ago)
cool tool!
J B (8 years ago)
Great screencast and cool tool.
Peter Griffin (8 years ago)
I'm completely new at any of this, and I would like to know if you could help me get started. I was wondering if there is a book, or something I could read, to learn how to build and write scripts for a web page. The simplest form would work.
macias102 (8 years ago)
At the start, when you executed several commands, why did you use && instead of ; ? Thank you for the video.
daldous (8 years ago)
That was interesting and informative. Thank you for making it.
