February 01, 2020

Youtube Downloader Perl Script


download videos and even convert them to mp3 or ogg

Normally, when you want to watch a Youtube video you have to use Adobe Flash or HTML5 in a browser like Firefox, Chrome (Chromium), Opera or Safari. Adobe Flash is incredibly inefficient, a huge security hole and notoriously buggy. Another issue is youtube's constant slow buffering speeds when limited by your ISP. It is frustrating if you are constantly waiting for the video to buffer in an interactive session.

An alternative is to use a script to download your videos for viewing later. Using curl may or may not be faster than Flash in the browser depending on the Youtube mirror you connect to. So, the idea is to pre-download a bunch of videos and when you are ready to watch them, they play at the highest quality with no Youtube buffer pausing at all. We like to watch videos using VLC which uses less CPU time than Flash and can play videos at higher than 1x speeds. In fact, we watch almost all of our vids at 2x (use the + key on the keypad on the right).

We also prefer a simpler scripted solution which does not rely on too many dependencies or proprietary code. For example, youtube-dl requires Python libraries and others require JavaScript, Greasemonkey or even PHP to work in Firefox. A simple script is easier to audit and modify as needed.

Latest Version: 0.61

Advantages of a scripted solution

AdBlock Plus and NoScript safe, No Adobe Flash needed: To download the Youtube video just copy and paste the URL from the browser's URL bar. You do not need to start the video or anything. In fact, if you have NoScript you do not even have to allow JavaScript on the Youtube site. You just need the link, pasted after the script name on the command line. Simple and easy.

Unlimited download speed: There are no download rate limits with the URL we extract from the HTML page. The videos will download as fast as the Youtube cache server will send the data. The speed primarily depends on the popularity of the video; the most requested videos are put on the highest bandwidth cache servers. Using a testing server on Linode (Xen VPS Hosting) we easily saw 24 megabytes per second downloads. The maximum upload and download limit on a Linode server is 45 megabytes per second. BTW, Linode gets a 10 out of 10 for overall quality in our opinion. On a side note, we have not found a reliable way of choosing the fastest mirror server. If your ISP has mirror servers they are normally slower than Google's, so we suggest finding their IP addresses and blocking the ISP mirrors.

No Advertisements: The script will _not_ download any ads or advertisements like what you would see if you used the browser to watch flash videos. The reason is the ads are not part of the video, but an overlay Youtube inserts. We do not do anything to remove the ads at all. When you see an ad on Youtube the video is actually paused and the advertisement is overlaid on top of the video window. When the ad finishes the overlay is removed and video is unpaused.

Getting Started

The youtube download script is written in Perl and can be run on Linux, Mac OSX, OpenBSD, FreeBSD or any operating system supporting perl. Since we are using perl the UNIX shell you use, like bash, tcsh, csh or sh does not matter.

cURL, note the capitalization, is the only dependency you need to have installed on your system and the binary needs to be in your shell's $PATH. cURL is useful not only for this perl script, but for any scripting you may do in the future. The standard cURL package will allow you to download videos using HTTP or HTTPS. To make it easy we included the following lines to install cURL using your OS's package manager. Your system may already have curl installed. Type "which curl" for the full path to the cURL binary.

## Ubuntu Linux
apt-get install curl

## Redhat Linux
yum install curl

## FreeBSD
pkg install curl

## OpenBSD 
pkg_add -i curl
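
If you would rather check for curl from perl than from the shell, the following is a minimal sketch of what "which curl" does. The function name find_in_path is our own for illustration, and the directory list is simply the sanitized PATH the download script sets; adjust it if your curl lives somewhere else.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return the full path of the first executable file called $name found in the
# colon separated $path_list, or undef if none of the directories contain it.
sub find_in_path {
    my ($name, $path_list) = @_;
    foreach my $dir (split(/:/, $path_list)) {
        next if ($dir eq "");
        my $candidate = File::Spec->catfile($dir, $name);
        return $candidate if (-f $candidate && -x $candidate);
    }
    return;
}

# Example: look for curl in the same directories the download script trusts.
my $curl = find_in_path("curl", "/bin:/usr/bin:/usr/local/bin:/opt/local/bin");
if (defined($curl)) {
    print "curl found at $curl\n";
} else {
    print "curl not found, install it with your package manager first\n";
}
```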

The Youtube Download Perl script

To use the script, copy and paste the block of perl code from the following text box to a file. We are going to call the script youtube_download.pl for this example, but you can name it anything you like. Remember to make the file executable, "chmod 755 youtube_download.pl". Make sure the path to perl is correct on the first line; for example, on FreeBSD 10.2 it is /usr/local/bin/perl instead of /usr/bin/perl. Type "which perl" for the full path to the perl binary.

#!/usr/bin/perl -T

use strict;
use warnings;

#
##  Calomel.org  ,:,  Download Youtube videos
##    Script Name : youtube_download.pl
##    Version     : 0.61
##    Valid from  : February 2020
##    URL Page    : https://calomel.org/youtube_wget.html
##    OS Support  : Linux, Mac OSX, OpenBSD, FreeBSD
#                `:`
## Two arguments
##    $1 Youtube URL from the browser
##    $2 prefix to the file name of the video (optional)
#

############  options  ##########################################

# Option: what file type do you want to download? The string is used to search
# in the youtube URL so you can choose mp4, webm, avi or flv.  mp4 is the most
# compatible and plays on android, ipod, ipad, iphones, vlc and mplayer.
my $fileType = "mp4";

# Option: what visual resolution or quality do you want to download? List
# multiple values just in case the highest quality video is not available, the
# script will look for the next resolution. You can choose "itag=22" for 720p,
# "itag=18" which means standard definition 640x360 and "itag=17" which is
# mobile resolution 144p (176x144). The script will always prefer to download
# the first listed resolution video format from the list if available.
#my $resolution = "itag=22,itag=18";
my $resolution = "itag=22";

# Option: How many times should the script retry if the download fails?
my $retryTimes = 2;

# Option: turn on DEBUG mode. Use this to reverse engineer this code if you are
# making changes or you are building your own youtube download script.
my $DEBUG=0;

#################################################################

# initialize global variables and sanitize the path
$ENV{PATH} = "/bin:/usr/bin:/usr/local/bin:/opt/local/bin";
my $prefix = "";
my $retry = 1;
my $retryCounter = 0;
my $user_url = "";
my $user_prefix = "";

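# collect the URL from the command line argument and stop with a usage message
# if no URL was given
die "\nUsage: $0 youtube_url [file_name_prefix]\n\n" unless (defined($ARGV[0]));
chomp($user_url = $ARGV[0]);
my $url = "$1" if ($user_url =~ m/^([a-zA-Z0-9\_\-\&\?\=\:\.\/]+)$/ or die "\nError: Illegal characters in YouTube URL\n\n" );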

# declare the user defined file name prefix if specified
if (defined($ARGV[1])) {
   chomp($user_prefix = $ARGV[1]);
   $prefix = "$1" if ($user_prefix =~ m/^([a-zA-Z0-9\_\-\.\ ]+)$/ or die "\nError: Illegal characters in filename prefix\n\n" );
}

# if the url down below does not parse correctly we start over here
tryagain:

# make sure we are not in a tryagain loop by checking the counter
if ( $retryTimes < $retryCounter ) {
   print "\n\n Stopping the loop because the retryCounter has exceeded the retryTimes option.";
   print "\n The video may not be available at the requested resolution or may be copy protected.\n\n";
   print "\nretryTimes counter = $retryTimes\n\n" if ($DEBUG == 1);
   exit;
}

# download the html from the youtube page containing the page title and video
# url. The page title will be used for the local video file name and the url
# will be sanitized to download the video.
my $html = `curl -sS -L --compressed -A "Mozilla/5.0 (compatible)" "$url"`  or die  "\nThere was a problem downloading the HTML page.\n\n";

# format the title of the page to use as the file name
my ($title) = $html =~ m/<title>(.+)<\/title>/si;
$title =~ s/[^\w\d]+/_/g or die "\nError: we could not find the title of the HTML page. Check the URL.\n\n";
$title = lc ($title);
$title =~ s/_youtube//ig;
$title =~ s/^_//ig;
$title =~ s/_amp//ig;
$title =~ s/_39_s/s/ig;
$title =~ s/_quot//ig;

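# filter the URL of the video from the HTML page. Use an empty string if the
# page did not contain a match so the retry logic below can take over.
my ($download) = $html =~ /"loaderUrl"(.*)/ig;
$download = "" unless (defined($download));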

# Print the raw separated strings in the HTML page
#print "\n$download\n\n" if ($DEBUG == 1);

# This is where we loop through the HTML code and select the file type and
# video quality. The resolutions are checked in the order they are listed in
# the $resolution option so the first listed resolution is preferred.
my @urls = split(',', $download);
my @res = split(',', $resolution);
OUTERLOOP:
foreach my $ress (@res) {
    foreach my $val (@urls) {
#      print "\n$val\n\n";

       if ( $val =~ /$fileType/ && $val =~ /$ress/ ) {
          print "\n  html to url separation complete.\n\n" if ($DEBUG == 1);
          print "$val\n" if ($DEBUG == 1);
          $download = $val;
          last OUTERLOOP;
       }
    }
}

# clean up by translating url encoding and removing unwanted strings
print "\n  Start regular expression clean up...\n" if ($DEBUG == 1);
$download =~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg;
$download =~ s/\\u0026/\&/g;
$download =~ s/(type=[^&]+)//g;
$download =~ s/(fallback_host=[^&]+)//g;
$download =~ s/(quality=[^&]+)//g;
$download =~ s/&+/&/g;
$download =~ s/&$//g;
$download =~ s/%2C/,/g;
$download =~ s/%252F/\//g;
$download =~ s/^:"url=//g;
$download =~ s/\"//g;
$download =~ s/\?itag=22&/\?/;
$download =~ s/\\url\:\\//g;
$download =~ s/\\//g;

# print the URL before adding the page title.
print "\n  The download url string: \n\n$download\n" if ($DEBUG == 1);

# check for &itag instances and remove one if there is more than one
my $counter1 = () = $download =~ /&itag=\d{2,3}/g;
print "\n  number of itag= (counter1): $counter1\n" if ($DEBUG == 1);
if($counter1 > 1){ $download =~ s/&itag=\d{2,3}//; }

# save the URL starting with http(s)... 
my ($youtubeurl) = $download =~ /(https?:.+)/;

# is the URL in the youtubeurl variable? If not, go to tryagain above.
if (!defined $youtubeurl) {
    print "\n URL did not parse correctly. Let's try another mirror...\n";
    $retryCounter++;
    sleep 2;
    goto tryagain;
}

# collect the title of the page
my ($titleurl) = $html =~ m/<title>(.+)<\/title>/si;
$titleurl =~ s/ - YouTube//ig;

# combine file variables into the full file name
my $filename = "unknown";
$filename = "$prefix$title.$fileType";

# url title to url encoding. all special characters need to be converted
$titleurl =~ s/([^A-Za-z0-9\+-])/sprintf("%%%02X", ord($1))/seg;

# combine the youtube url and title string
$download = "$youtubeurl\&title=$titleurl";

# Process check: Are we currently downloading this exact same video? Two of the
# same download processes will overwrite each other and corrupt the file.
my $running = `ps auwww | grep [c]url | grep -c "$filename"`;
print "\n  Is the same file name already being downloaded? $running" if ($DEBUG == 1);
if ($running >= 1)
  {
   print "\n  Already $running process, exiting." if ($DEBUG == 1);
   exit 0;
  };

# Print the long, sanitized youtube url for testing and debugging
print "\n  The following url will be passed to curl:\n" if ($DEBUG == 1);
print "\n$download\n" if ($DEBUG == 1);

# print the file name of the video being downloaded for the user 
print "\n Download:   $filename\n\n" if ($retryCounter == 0 || $DEBUG == 1);

# print the itag quantity for testing
my $counter2 = () = $download =~ /&itag=\d{2,3}/g;
print "\n  Does itag=1 ?  $counter2\n\n" if ($DEBUG == 1);
if($counter2 < 1){
 print "\n URL did not parse correctly (itag).\n";
 exit;
}

# Background the script before the download starts. Use "ps" if you need to
# look for the process running or use "ls -al" to look at the file size and
# date.
fork and exit;

# Download the video, resume if necessary
system("curl", "-sSRL", "-A", "Mozilla/5.0 (compatible)", "-o", "$filename", "--retry", "5", "-C", "-", "$download");

# Print the exit error code
print "\n  exit error code: $?\n" if ($DEBUG == 1);

# Exit Status: Check if the file exists and we received the correct error code
# from the curl system call. If the download experienced any problems the
# script will run again and try to continue the download until the retryTimes
# count limit is reached.

if( $? == 0 && -e "$filename" && ! -z "$filename" )
   {
      print "\n  Finished: $filename\n\n" if ($DEBUG == 1);
   }
 else
   {
      print STDERR "\n  FAILED: $filename\n\n" if ($DEBUG == 1);
      $retryCounter++;
      sleep $retryCounter;
      goto tryagain;
   }

#### EOF #####
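
The two central steps of the script, choosing a stream and cleaning up its URL, can be tried in isolation. The following is a minimal sketch and not the script itself: the function names pick_stream and decode_stream_url are ours for illustration, the sample strings and the example.invalid host are made up, and a real Youtube page will contain more fields than shown here. The selection follows the behavior described in the options section, where the first listed itag is preferred when it is available.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first candidate that mentions the wanted file type and an itag
# from the comma separated $resolution list. The itags are tried in the order
# listed, so the first listed resolution wins when it is available.
sub pick_stream {
    my ($candidates, $file_type, $resolution) = @_;
    foreach my $itag (split(',', $resolution)) {
        foreach my $val (@$candidates) {
            return $val if ($val =~ /\Q$file_type\E/ && $val =~ /\Q$itag\E/);
        }
    }
    return;
}

# Undo the escaping found in the page source and keep only the part of the
# string starting at http(s). Returns undef if no URL is left.
sub decode_stream_url {
    my ($raw) = @_;
    $raw =~ s/\\u0026/&/g;                               # escaped ampersands
    $raw =~ s/%([A-Fa-f0-9]{2})/pack('C', hex($1))/eg;   # percent encoding
    $raw =~ s/\\//g;                                     # leftover backslashes
    $raw =~ s/"//g;                                      # double quotes
    my ($url) = $raw =~ /(https?:.+)/;
    return $url;
}

# Made up sample data, not real Youtube output
my @sample = (
    'url=https:\/\/example.invalid\/v?itag=18\u0026mime=video%2Fmp4',
    'url=https:\/\/example.invalid\/v?itag=22\u0026mime=video%2Fwebm',
    'url=https:\/\/example.invalid\/v?itag=22\u0026mime=video%2Fmp4',
);

my $picked = pick_stream(\@sample, "mp4", "itag=22,itag=18");
my $clean  = defined($picked) ? decode_stream_url($picked) : "no match";
print "$clean\n";
```

With the sample data the third entry is chosen, because it is the only mp4 entry carrying the first listed itag.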

How do I use the script ?

Once you have the script set up you just need to find a Youtube video. We chose a video from Bethesda Softworks for Fallout 4. Execute the script with the Youtube URL copied and pasted from the browser's URL bar. Note that you can add one more argument to the end of the command line to add a prefix to the file name. Here is an example of both options; notice the change in file names, as the second example has "fo4_" as the file name prefix. Also note that some Youtube URLs have ampersands "&" in them. For these types of URLs just use double quotes around the URL so your shell passes the full string into the script.

## Example 1: Here we just pass the youtube URL
#
user@machine$ ./youtube_download.pl "https://www.youtube.com/watch?v=0YnyQTSALCE"

   Download: fallout_4_the_wanderer_trailer.mp4


## Example 2: Here we pass the Youtube URL and the file name prefix "fo4_"
#
user@machine$ ./youtube_download.pl "https://www.youtube.com/watch?v=0YnyQTSALCE" fo4_

   Download: fo4_fallout_4_the_wanderer_trailer.mp4

The video will download in the background and save to your current directory. You can play the file with your favorite video player; we prefer VLC.
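
The script refuses any URL containing characters outside a small whitelist, which is another reason to quote the link so it arrives in one piece. Here is a minimal sketch of that check as a reusable function. The name clean_url is hypothetical; the character list is the one the script uses for its first argument.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the URL if it only contains letters, digits and _ - & ? = : . /
# Return undef for anything else, including a missing argument.
sub clean_url {
    my ($candidate) = @_;
    return unless (defined($candidate));
    chomp($candidate);
    return $1 if ($candidate =~ m/^([a-zA-Z0-9_\-&?=:.\/]+)$/);
    return;
}

# Example: the first URL passes, the second is rejected because of ";" and " "
foreach my $test ("https://www.youtube.com/watch?v=0YnyQTSALCE",
                  "https://www.youtube.com/watch?v=abc; ls") {
    my $ok = clean_url($test);
    print defined($ok) ? "accepted: $ok\n" : "rejected: $test\n";
}
```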

Items to keep in mind...

File name is the same as the name of the web page: Notice the file name is the same as the title of the Youtube web page. We have also scrubbed the title to take out all special characters and reduce all letters to lower case. This makes it easier to read and to use on the command line.
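
To see how a page title turns into a file name, the scrubbing steps from the script can be run on their own. This is a sketch under a few assumptions: title_to_filename is a name we made up for the example and the HTML strings are invented samples, not pages fetched from Youtube.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the local file name from the HTML title, an optional prefix and the
# file type, using the same substitutions as the download script.
sub title_to_filename {
    my ($html, $prefix, $file_type) = @_;
    my ($title) = $html =~ m/<title>(.+)<\/title>/si;
    return unless (defined($title));
    $title =~ s/[^\w\d]+/_/g;    # runs of special characters become one "_"
    $title = lc($title);         # lower case everything
    $title =~ s/_youtube//g;     # drop the " - YouTube" suffix
    $title =~ s/^_//;            # no leading underscore
    $title =~ s/_amp//g;         # leftovers of the HTML entity &amp;
    $title =~ s/_39_s/s/g;       # leftovers of &#39;s (apostrophe s)
    $title =~ s/_quot//g;        # leftovers of &quot;
    return "$prefix$title.$file_type";
}

# Invented sample page, only the title tag matters here
my $page = '<html><head><title>Fallout 4 - The Wanderer Trailer - YouTube</title></head></html>';
print title_to_filename($page, "", "mp4"), "\n";      # fallout_4_the_wanderer_trailer.mp4
print title_to_filename($page, "fo4_", "mp4"), "\n";  # fo4_fallout_4_the_wanderer_trailer.mp4
```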

Save Location: The video will be saved in your current directory.

Script methodology: The download will run in the background. You can start as many of these downloads as you want. We have started as many as a dozen simultaneous downloads without issue. The script will finish silently; meaning when the download is finished you will not get any notification.

Process state: You can check if the download is running by looking at the process list (ps) and grep'ing for curl. Something like "ps aux | grep -i curl" will work. The script does not report how fast the download is going. What you can do is watch the file size change using "ls -la" and estimate from there. The file is downloaded serially, so as soon as the download starts you should be able to open the file in VLC if you want to watch the video right away.
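
The file size estimate can also be automated. The sketch below samples the size of a file twice and reports the growth per second. Both function names are hypothetical, the five second interval is an arbitrary choice, and the result is only a rough average over that interval.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Average growth in bytes per second between two size samples. Returns undef
# when the interval is not a positive number of seconds.
sub bytes_per_second {
    my ($size_before, $size_after, $seconds) = @_;
    return if ($seconds <= 0);
    return ($size_after - $size_before) / $seconds;
}

# Look at the size of $file, wait $seconds, look again and return the rate.
# A file which does not exist yet is counted as zero bytes.
sub sample_download_rate {
    my ($file, $seconds) = @_;
    my $before = (-s $file) || 0;
    sleep($seconds);
    my $after = (-s $file) || 0;
    return bytes_per_second($before, $after, $seconds);
}

# Usage: pass the name of the video file being downloaded as the argument
if (@ARGV) {
    my $rate = sample_download_rate($ARGV[0], 5);
    printf("%s is growing at about %.1f KB/s\n", $ARGV[0], $rate / 1024);
}
```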

Video file type: The video will download in the file type you specify. mp4 seems to be the most compatible type, but WebM and AVI are also available. WebM is a digital multimedia container file format promoted by the open-source WebM Project headed by Google; it typically carries VP8 or VP9 video, which is why it is sometimes loosely called VP8. It comprises a subset of the Matroska multimedia container format. If your current media player does not support WebM then you need a codec for your OS. Just search on Google for "webm codec" and you should get pointed in the right direction. Note, you can play this format with the VLC media player which is available on all major operating systems. VLC is a free and open source cross-platform multimedia player and framework that plays most multimedia files as well as DVD, Audio CD, VCD, and various streaming protocols. VLC can also play the videos at greater than 1x speed by hitting the plus "+" key on the keypad. When playing videos at anything faster than 1x the sound will be automatically pitch corrected.

Always use the latest script version: Youtube changes the format of their HTML pages every once in a while, which consequently breaks download scripts like the one we have here. The average amount of time between HTML format changes is three (3) months. If you find this script no longer works make sure to check back on this page for any updates. We will do our best to keep this command line option working since we use this script at least once a day. Note that at the top of the script we have the version number and the date the script is valid from. We will also post on the RSS feed (link at the top of the page) when a new version is available.
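
Since the version lives in the comment header, a quick way to compare a saved copy against the Latest Version listed on this page is to read it back out. A small sketch: script_version is a hypothetical helper and it assumes the header keeps the "Version :" layout used in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the version string from the "##    Version     : x.y" header line,
# or undef if the text does not contain such a line.
sub script_version {
    my ($text) = @_;
    my ($version) = $text =~ m/^##\s+Version\s*:\s*([0-9]+(?:\.[0-9]+)*)/m;
    return $version;
}

# Example using a copy of the header from the script above
my $header = <<'END';
#
##  Calomel.org  ,:,  Download Youtube videos
##    Script Name : youtube_download.pl
##    Version     : 0.61
##    Valid from  : February 2020
END

print "script version: ", script_version($header), "\n";
```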

How do I make an audio mp3 or ogg from a youtube video ?

At some point you will want to save the sound from a video. A good case is downloading an instructional video and listening to it on your music player, like an Android phone or Apple device. We like to download class videos from the Massachusetts Institute of Technology (MIT) and listen to them in the car.

You can use the above download script to get a video and convert the video's soundtrack to MP3 (or OGG or any other) format using avconv. You will need to install avconv in order to extract the audio. For Ubuntu use "apt-get install libav-tools". If avconv is not available as a package, install ffmpeg instead: on OpenBSD use "pkg_add -i ffmpeg" and on FreeBSD use "pkg install ffmpeg". For Mac OSX you may want to look at ffmpegX.

## Convert the audio from a Youtube video to mp3 or ogg, audio only.

## download the video. (Same link to the Fallout 4 video as above)
user@machine$ ./youtube_download.pl "https://www.youtube.com/watch?v=0YnyQTSALCE"
 Download:   fallout_4_the_wanderer_trailer.mp4

## Convert video to mp3, audio only
user@machine$ avconv -i fallout_4_the_wanderer_trailer.mp4 -vn -ab 128k fallout_4_the_wanderer_trailer.mp3

## Convert video to ogg, audio only
user@machine$ avconv -i fallout_4_the_wanderer_trailer.mp4 -vn -ab 128k fallout_4_the_wanderer_trailer.ogg

## Extract video's audio to aac, audio only (best quality)
user@machine$ avconv -i fallout_4_the_wanderer_trailer.mp4 -map 0:1 -c:a copy fallout_4_the_wanderer_trailer.aac
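
If you convert videos often, the command line can be assembled in perl and passed to system() as a list, which avoids shell quoting problems with file names. This is only a sketch: audio_convert_cmd is a name we made up, and the default of "128k" assumes a bitrate of 128 kilobits per second is what you want.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the avconv argument list to extract the audio of $video into a file
# with the same base name and the extension $ext (mp3, ogg, ...).
sub audio_convert_cmd {
    my ($video, $ext, $bitrate) = @_;
    $bitrate = "128k" unless (defined($bitrate));
    (my $base = $video) =~ s/\.[^.\/]+$//;    # strip the last file extension
    return ("avconv", "-i", $video, "-vn", "-ab", $bitrate, "$base.$ext");
}

# Example: show the command which would be run for an ogg conversion
my @cmd = audio_convert_cmd("fallout_4_the_wanderer_trailer.mp4", "ogg");
print join(" ", @cmd), "\n";

# To actually run the conversion, uncomment the following line:
# system(@cmd) == 0 or die "avconv failed with exit status $?\n";
```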

Questions?

How about an Android device app ?

Yes, if you have an Android phone or tablet then check out the YouTube Downloader for Android. You can choose the format of the video from mp4 to avi to webm and even download 60fps videos! Note: the app is not in the Google store yet, but you can download it directly from dentex's blog.

This script only supports youtube, are you going to support other sites? No, we only use youtube at this time. For download support of other sites you should check out youtube-dl. youtube-dl supports almost 20 different video sites including youtube. You might even be able to install youtube-dl using apt-get or yum.

I have a patch or the script is broken. You are welcome to mail us. Please make sure you look at any errors the script outputs to see if you can find the cause of the error. The contact link is at the bottom of this page.

