Level: Intermediate
Teodor Zlatanov (tzz@bu.edu), Programmer, Gold Software Systems
11 Dec 2003
Every self-respecting computer and music fan needs to be able to manipulate MP3s -- the de facto standard for recreational digital music use. In this article, Ted looks at ways to manage and manipulate MP3s (searching, tagging, renaming, commenting, etc.) using the autotag.pl application. Ted takes you through the application, illustrating how CPAN modules make it possible.
Manipulating MP3 files is a necessity for computer-savvy music lovers
today. Although other formats exist and are flourishing, this article will
concentrate on the MP3 format because it is by all appearances the most
popular one today. However, the general approaches shown will work with
other music file formats that allow tags as well. In fact, many file
formats that use tags could benefit from an application like mine, autotag.pl.
I welcome your suggestions.
This article discusses Perl issues in general, manipulating MP3 files in particular, and the autotag.pl application specifically.
I used the MP3::Tag and WebService::FreeDB CPAN modules only, even
though the MP3::Info, MP3::ID3Lib, MusicBrainz::Client, and
AudioFile::Identify::MusicBrainz modules also exist and can be useful.
I did not use MP3::ID3Lib primarily because it requires
the id3lib software (see Resources). While MP3::Info is pure Perl and
simple to install, I found MP3::Tag more powerful. MusicBrainz::Client and
AudioFile::Identify::MusicBrainz were not used because MusicBrainz appears
to be a less comprehensive database of released CDs than FreeDB. In the
end, the choice of ID3 tagging module and track information module is up
to you. My experience, painfully gained through trial and error, is that
MP3::Tag and WebService::FreeDB will serve you best.
I made the choice not to use the CDDB (Gracenote) disc database, even
though it is very comprehensive. Gracenote is a company that keeps
proprietary databases of CD track lists (only searching -- no wholesale
downloading -- of those databases is allowed). Much of those
databases' content was contributed by volunteers in the early days when
Gracenote was just CDDB. FreeDB is a volunteer effort organized to
provide a free, unrestricted database of CD tracklists. The entire
contents of the FreeDB databases are available for download without any
copyright restrictions -- so you could set up your own FreeDB server if you
wanted.
The modules that I did not use were not necessarily inferior, so if
you like you can use them. I simply liked MP3::Tag and WebService::FreeDB
better based on personal experience with them and for the reasons above.
The actual reading and writing of tags is abstracted in functions, so you
won't have to change a lot if you use a different module for MP3 tag
reading and writing.
I should also mention that the Term::ReadLine::Gnu CPAN module works
better for me than the default module, Term::ReadLine::Perl, on Linux
inside the xterm and Eterm terminal emulators. You may want to install it on
top of Term::ReadLine if you notice strange behavior at the prompts that
expect text.
A word about MP3 tags
First, there was music. Then came computers. Computers were slow,
and they beeped. Even with such sad tools as the PC speaker (oh, how
jealous I was of Apple and Amiga users), programs were written to produce
music for games and entertainment. Then came better and better sound
cards, and office walls around the world now shake with surround-sound and
THX-certified speakers.
In parallel with these hardware developments came a multitude of sound
formats. There was .mid for MIDI melodies, .voc, .mod, .wav, and so on.
The proprietary MP3 format, which involves many patents owned by the
German Fraunhofer institute, became popular over time -- it offered decent
compression and performance. There are formats other than MP3, notably
Ogg Vorbis, but today MP3 still appears to be the top choice for music
storage.
One nice thing about MP3 files was that they could be tagged with ID3
tags. Inside the file was information about it -- what's commonly known as
metadata. The album, artist, track name, comments, and (with ID3 version
1.1) even the track number could be stored in the ID3 tag as long as they
fit in its short fixed-width fields (30 bytes each for the title, artist, and album).
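To make the fixed-width layout concrete, here is a minimal sketch of decoding an ID3v1 block by hand. It is not part of autotag.pl (which relies on MP3::Tag for this); the function name parse_id3v1() and the sample values are made up for illustration. In a real file the block is the last 128 bytes; here it is built in memory with pack.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of autotag.pl): decode a 128-byte ID3v1 block.
sub parse_id3v1 {
    my ($block) = @_;
    return undef unless defined $block && length($block) == 128;

    # "TAG", then 30-byte title, artist, and album, a 4-byte year,
    # a 30-byte comment, and a 1-byte genre index
    my ($magic, $title, $artist, $album, $year, $comment, $genre) =
        unpack 'a3 Z30 Z30 Z30 a4 a30 C', $block;
    return undef unless $magic eq 'TAG';

    # ID3v1.1: a zero byte at comment offset 28 followed by a nonzero
    # byte means the last comment byte holds the track number
    my $track;
    if (substr($comment, 28, 1) eq "\0" && substr($comment, 29, 1) ne "\0") {
        $track   = ord substr($comment, 29, 1);
        $comment = substr $comment, 0, 28;
    }

    # drop NUL padding and trailing spaces
    for ($title, $artist, $album, $comment) {
        s/\0.*\z//s;
        s/\s+\z//;
    }

    return {
        title   => $title,
        artist  => $artist,
        album   => $album,
        year    => $year,
        comment => $comment,
        track   => $track,
        genre   => $genre,
    };
}

# Build a sample block in memory instead of reading a real file
my $sample = pack 'a3 a30 a30 a30 a4 a28 C C C',
    'TAG', 'Example Song', 'Example Artist', 'Example Album', '2003',
    'ripped', 0, 4, 17;

my $info = parse_id3v1($sample);
print "$info->{artist} / $info->{album} / track $info->{track}: $info->{title}\n";
```

Anything longer than a field's width simply does not fit, which is the limitation ID3v2 was designed to remove.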
The successor to ID3 version 1.1 was ID3 version 2 (ID3v2 for short),
which is much better in almost every aspect except simplicity. ID3v2 can
handle multiple languages, store arbitrarily long data in each tag
element, and even store pictures as part of the tag. Unfortunately,
dealing with ID3v2 involves learning that TALB is the album name, and TIT2
is the track title. It makes one long for the Ogg Vorbis format, where
the artist tag element is called...wait for it...ARTIST! (To be fair,
this is just a convention -- Ogg Vorbis comments are as free-form as you
want to make them.) Unfortunately, the billions of MP3 files in existence
can't be converted without loss of quality to Ogg Vorbis or any other
format, so at the very least the next five years will find us dealing with
MP3 files in addition to whatever the next "hot" format is.
I have tried very hard to abstract tags as content from the actual ID3
tags. It will be easy, when the time comes, to modify autotag.pl so it
will handle other tagging formats besides ID3.
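One way to picture that abstraction is a lookup table from friendly names to ID3v2 frame IDs. The sketch below is only an illustration: %frame_for and friendly_to_frames() are hypothetical names that do not appear in autotag.pl, and the frame IDs are the ID3v2.3 ones used throughout this article.

```perl
use strict;
use warnings;

# Hypothetical lookup table (not part of autotag.pl): friendly name -> frame ID
my %frame_for = (
    album   => 'TALB',
    title   => 'TIT2',
    artist  => 'TPE1',
    track   => 'TRCK',
    year    => 'TYER',
    comment => 'COMM',
    url     => 'WXXX',
);

# Convert a hash keyed by friendly names into one keyed by frame IDs.
sub friendly_to_frames {
    my ($friendly) = @_;
    my %tag;
    foreach my $name (keys %$friendly) {
        my $frame = $frame_for{$name} or next;    # skip names we don't know
        $tag{$frame} = $friendly->{$name};
    }
    return \%tag;
}

my $tag = friendly_to_frames({
    artist => 'Example Artist',
    title  => 'Example Song',
    mood   => 'ignored',                          # no frame for this one
});
print "$_ => $tag->{$_}\n" foreach sort keys %$tag;
```

Supporting another tagging format would then mostly be a matter of swapping the table.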
Fundamental autotag.pl functions
There were a few things in autotag.pl that I put in separate
functions. First of all, contains_word_char() is a function that makes
the decision whether some text contains a word (\w in Perl) character.
It works correctly with undefined values as well, whereas with warnings
turned on, a regular expression match on an undefined value will print a
warning. It's primarily useful because it doesn't show a warning; in order
to achieve that effect without a function you'd have to check whether the
string is defined every time.
Listing 1. The contains_word_char() function
# {{{ contains_word_char: return 1 if the text contains a word character
sub contains_word_char
{
    my $text = shift @_;
    # test definedness rather than truth, so that a value like "0" still counts
    return (defined $text && $text =~ m/\w/) ? 1 : 0;
}
# }}}
Next come the input routines. These are pretty verbose, and they
attempt to handle most cases of user interaction the program will need.
The only slightly unusual function is read_yes_no(), which can be
given a Y or 1 default parameter to make the default true, and any
other parameter to make the default false. Thus, I can make the
read_yes_no() function accept different default values when the user
presses Enter or Space. In addition, the Backspace or Delete keys will
reverse the default. It's not flashy code, but it's very useful.
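The default-handling rules just described can be sketched as a pure function. This is an assumption-laden illustration rather than the code from autotag.pl: resolve_yes_no() is a hypothetical name, the handling of the y and n keys is my guess, and the real read_yes_no() also prints the prompt and reads the key from the terminal.

```perl
use strict;
use warnings;

# Hypothetical helper: decide the answer from the default and the key pressed.
# Returns 1 (yes), 0 (no), or undef when the key is not recognized.
sub resolve_yes_no {
    my ($default, $key) = @_;

    # a default of "Y" or 1 means yes; anything else means no
    my $default_true = (defined $default && $default =~ m/^(?:y|1)$/i) ? 1 : 0;

    # Enter or Space accepts the default
    return $default_true if $key eq "\n" || $key eq "\r" || $key eq ' ';

    # Backspace or Delete reverses the default
    return ($default_true ? 0 : 1) if $key eq "\x08" || $key eq "\x7f";

    # explicit answers (an assumption, not described in the article)
    return 1 if $key =~ m/^y$/i;
    return 0 if $key =~ m/^n$/i;

    return undef;    # unrecognized key: the caller should ask again
}

printf "default Y, Enter  -> %d\n", resolve_yes_no('Y', "\n");
printf "default Y, Delete -> %d\n", resolve_yes_no('Y', "\x7f");
```

Keeping the decision separate from the terminal handling makes the rules easy to check without typing at a prompt.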
autotag.pl preliminaries
The autotag.pl application begins with some initialization routines.
Listing 3. Initialization
use constant SEARCH_ALL => 'all';

my %freedb_searches = (
    artist => { keywords => [], abbrev => 'I', tagequiv => 'TPE1' },
    title  => { keywords => [], abbrev => 'T', tagequiv => 'TALB' },
    track  => { keywords => [], abbrev => 'K', tagequiv => 'TIT2' },
    rest   => { keywords => [], abbrev => 'R', tagequiv => 'COMM' },
);

# maps ID3 v2 tag info to WebService::FreeDB info
my %info2freedb = (
    TALB => 'cdname',
    TPE1 => 'artist',
);

my %supported_frames = (
    TIT1 => 1,
    TIT2 => 1,
    TRCK => 1,
    TALB => 1,
    TPE1 => 1,
    COMM => 1,
    WXXX => 1,
    TYER => 1,
);

my @supported_frames = keys %supported_frames;

my $term = Term::ReadLine->new('Input> '); # global input
The SEARCH_ALL constant is what I use when the user wants to search
for a word everywhere -- track names, artist names, etc. I made it a
constant in case anyone wants to change it to something else, but it could
have been hard-coded as "all" as well.
The %freedb_searches hash maps FreeDB fields to information about
them, including ID3v2 tag elements. For instance, it says that what
FreeDB calls "artist" is known as "TPE1" in an MP3 tag. The "abbrev" field
in the hash entry is used to define command-line switches, so later I can
define an -artist switch that can be abbreviated to -i based on the
%freedb_searches information.
The %info2freedb hash maps ID3v2 fields to the FreeDB fields that are common
across all tracks in a disc. These are not the fields in %freedb_searches; this
is a different mapping that says that "TALB" and "TPE1," known to FreeDB as
"cdname" and "artist," respectively, are the same for all tracks in an album.
The %supported_frames hash and the @supported_frames list will be used
to figure out what ID3v2 tag elements I support. I could have generated
the hash from the list instead of getting the list from the hash, but I
feel the difference is irrelevant. The supported frames are used for mass
tagging and when writing ID3v2 tags (I only modify the supported frames).
Finally, I create a Term::ReadLine object for user input throughout the application.
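For comparison, generating the hash from the list looks like the sketch below. It is an alternative to what Listing 3 does, not the code in autotag.pl; its one practical difference is that the list keeps a fixed order, whereas keys() returns the frames in no particular order.

```perl
use strict;
use warnings;

# Alternative to Listing 3: start from the list, derive the lookup hash
my @supported_frames = qw(TIT1 TIT2 TRCK TALB TPE1 COMM WXXX TYER);
my %supported_frames = map { $_ => 1 } @supported_frames;

print "TALB is supported\n" if exists $supported_frames{TALB};
print "Frames in a stable order: @supported_frames\n";
```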
Next, I initialize the AppConfig options. Bear with me, this is
useful.
Listing 4. AppConfig initialization
# {{{ set up AppConfig and process -help
my $config = AppConfig->new();

$config->define(
    DEBUG =>
        { ARGCOUNT => ARGCOUNT_ONE, DEFAULT => 0, ALIAS => 'D' },
    CONFIG_FILE =>
        { ARGCOUNT => ARGCOUNT_ONE, DEFAULT => 0, ALIAS => 'F' },
    HELP =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'H' },
    DUMP =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0 },
    ACCEPT_ALL =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'C' },
    DRYRUN =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'N' },
    GUESS_TRACK_NUMBERS_ONLY =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'G' },
    STRIP_COMMENT_ONLY =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'SC' },
    MASS_TAG_ONLY =>
        { ARGCOUNT => ARGCOUNT_HASH, ALIAS => 'M' },
    RENAME_ONLY =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'RO' },
    RENAME_MAX_CHARS =>
        { ARGCOUNT => ARGCOUNT_ONE, DEFAULT => 30},
    RENAME_FORMAT =>
        { ARGCOUNT => ARGCOUNT_ONE, DEFAULT => '%a-%t-%n-%c-%s.mp3'},
    RENAME_BADCHARS =>
        { ARGCOUNT => ARGCOUNT_LIST, ALIAS => 'RB' },
    RENAME_REPLACECHARS =>
        { ARGCOUNT => ARGCOUNT_LIST, ALIAS => 'RR' },
    RENAME_REPLACEMENT =>
        { ARGCOUNT => ARGCOUNT_ONE, DEFAULT => '_' },
    FREEDB_HOST =>
        { ARGCOUNT => ARGCOUNT_ONE, DEFAULT => 'http://www.freedb.org', },
    OR =>
        { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => '0', },
    SEARCH_ALL() =>
        { ARGCOUNT => ARGCOUNT_LIST, ALIAS => 'A' },
);

foreach my $search (keys %freedb_searches)
{
    $config->define($search => {
        ARGCOUNT => ARGCOUNT_LIST,
        ALIAS    => $freedb_searches{$search}->{abbrev},
    });
}

$config->args();
$config->file($config->CONFIG_FILE())
    if $config->CONFIG_FILE();

unless (scalar @{$config->RENAME_BADCHARS()})
{
    push @{$config->RENAME_BADCHARS()}, split(//, "\"`!'?&[]()/;\n\t");
}

unless (scalar @{$config->RENAME_REPLACECHARS()})
{
    push @{$config->RENAME_REPLACECHARS()}, split(//, " ");
}

if ($config->HELP())
{
    print <<EOHIPPUS;
$0 [options] File1.mp3 File2.mp3 ...
Options:
-help (-h) : print this help
-config_file (-f) N : use this config file, see AppConfig module docs for format
-debug (-d) N : print debugging information (level N, 0 is lowest)
-dump : just dump the list of albums and tracks within them
-dryrun (-n) : do everything but modify the MP3 files
-freedb_host H : set the FreeDB host, default "www.freedb.org"
-or : search for keyword A or keyword B, not A and B as usual
-accept_all (-c) : accept all search results for consideration for each file,
also accept all renames without asking
-rename_badchars (-rb) A -rb B : characters A and B to remove when renaming
-rename_replacechars (-rr) A -rr B : characters A and B to replace
when renaming
-rename_max_chars N : use at most this many characters from a tag
element when renaming, default: ${\$config->RENAME_MAX_CHARS()}
-rename_replacement X : character to use when replacing,
default: [${\$config->RENAME_REPLACEMENT()}]
-rename_format F : format for renaming; default "${\$config->RENAME_FORMAT()}"
%a -> Artist
%t -> Album name
%n -> Track number
%c -> Comment
%s -> Song title
-guess_track_numbers_only (-g) : guess track numbers using the file
name, then exit
-rename_only (-ro) : rename tracks using the given format (see
-rename_format), then exit
-mass_tag_only (-m) A=X -m B=Y : mass-tag files (tag element A is X,
B is Y), then exit (tag elements
available: @supported_frames)
-strip_comment_only (-sc) : strip comments and URLs, then exit
Repeatable options (you can specify them more than once, K is the keyword):
-all (-a) K : search everywhere
-artist (-i) K : search for these artists
-title (-t) K : search for these titles
-track (-k) K : search for these tracks
-rest (-r) K : search for these keywords everywhere else
Note that the repeatable options are cumulative, so artist A and title
B will produce matches for A and B, not A or B. In the same way,
artist A and artist B will produce matches for A and B, not A or B.
If you want to match A or B terms, use -or, for instance:
$0 -or -artist "pink floyd" -artist "fred flintstone"
EOHIPPUS
    exit;
}
# }}}
Yes, all that code just initialized the command-line options. With
AppConfig, those options can be used and modified throughout the program;
there are many benefits to using AppConfig that are outside the scope of
this article (see Resources for more information
on AppConfig).
Also, I use the entries in the %freedb_searches hash to create the
appropriate configuration options, which makes life easier for the user
and for the programmer.
After loading a configuration file, if the user specified it, I
populate the character replacement and bad character arrays with a
sensible default.
Finally, I handle the -help switch. Note how the default values for
various options are printed inside the help text, using variable
interpolation. This makes for a very readable help message. I always
update my help message right after I add a feature, and sometimes even
before. I believe that help should be synchronized with the functionality
of the program; otherwise the program is confusing and the help is
misleading. The autotag.pl program in particular needs more documentation
-- POD-style documentation would be nice, and that may be in place by the
time you read this article. POD documentation is a part of the script, so
downloading autotag.pl (see Resources) will
include the POD documentation if I have written it already.
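The interpolation trick used in the help text deserves a closer look, since method calls are not expanded inside strings on their own. The sketch below shows the idiom in isolation; max_chars() is a hypothetical stand-in for an accessor such as $config->RENAME_MAX_CHARS(), and the second line shows the related array form of the same idea.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a config accessor such as $config->RENAME_MAX_CHARS()
sub max_chars { return 30 }

# ${\ EXPR } dereferences a reference to the value of EXPR, so the call
# runs while the string is being built; @{[ EXPR ]} does the same via an
# anonymous array.
my $help = <<"EOHELP";
-rename_max_chars N : default ${\ max_chars()}
twice that value    : @{[ max_chars() * 2 ]}
EOHELP

print $help;
```

Because the defaults are computed when the help is printed, the message cannot drift away from the values the program really uses.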
ID3v2 tag-related functions
The get_tag() function is essential to autotag.pl. Given an MP3 file
name, it builds a hash tag from the file. If the tag is only ID3v1,
get_tag() will offer to upgrade the ID3 tag for free (what a deal!). If
there is no ID3 tag, get_tag() will create one. Furthermore, get_tag()
knows to look at the Text and URL sub-elements of the COMM and WXXX tag
elements, respectively.
Listing 5. The get_tag() function
# {{{ get_tag: get a ID3 V2 tag, using V1 if necessary
sub get_tag
{
    my $file = shift @_;
    my $upgrade = shift @_;
    my $mp3 = MP3::Tag->new($file);

    return undef unless defined $mp3;

    $mp3->get_tags();
    my $tag = {};

    if (exists $mp3->{ID3v2})
    {
        my $id3v2 = $mp3->{ID3v2};
        my $frames = $id3v2->supported_frames();
        while (my ($fname, $longname) = each %$frames)
        {
            # only grab the frames we know
            next unless exists $supported_frames{$fname};

            $tag->{$fname} = $id3v2->get_frame($fname);
            delete $tag->{$fname} unless defined $tag->{$fname};
            $tag->{$fname} = $tag->{$fname}->{Text} if $fname eq 'COMM';
            $tag->{$fname} = $tag->{$fname}->{URL} if $fname eq 'WXXX';
            $tag->{$fname} = '' unless defined $tag->{$fname};
        }
    }
    elsif (exists $mp3->{ID3v1})
    {
        warn "No ID3 v2 TAG info in $file, using the v1 tag";
        my $id3v1 = $mp3->{ID3v1};
        $tag->{COMM} = $id3v1->comment();
        $tag->{TIT2} = $id3v1->song();
        $tag->{TPE1} = $id3v1->artist();
        $tag->{TALB} = $id3v1->album();
        $tag->{TYER} = $id3v1->year();
        $tag->{TRCK} = $id3v1->track();
        $tag->{TIT1} = $id3v1->genre();

        if ($upgrade && read_yes_no("Upgrade ID3v1 tag to ID3v2 for $file?", 1))
        {
            set_tag($file, $tag);
        }
    }
    else
    {
        warn "No ID3 TAG info in $file, creating it";
        $tag = {
            TIT2 => '',
            TPE1 => '',
            TALB => '',
            TYER => 9999,
            COMM => '',
        };
    }

    print "Got tag ", Dumper $tag
        if $config->DEBUG();

    return $tag;
}
# }}}
The set_tag() function is the sibling of get_tag(). It writes an ID3v2
tag, observing the COMM and WXXX frames' sub-elements. It takes a hash
reference such as get_tag() might produce.
Listing 6. The set_tag() function
# {{{ set_tag: set a ID3 V2 tag on a file
sub set_tag
{
    my $file = shift @_;
    my $tag = shift @_;
    my $mp3 = MP3::Tag->new($file);

    print Dumper $tag;

    my $tags = $mp3->get_tags();
    my $id3v2;

    if (ref $tags eq 'HASH' && exists $tags->{ID3v2})
    {
        $id3v2 = $tags->{ID3v2};
    }
    else
    {
        $id3v2 = $mp3->new_tag("ID3v2");
    }

    my %old_frames = %{$id3v2->get_frame_ids()};

    foreach my $fname (keys %$tag)
    {
        $id3v2->remove_frame($fname)
            if exists $old_frames{$fname};

        if ($fname eq 'WXXX')
        {
            $id3v2->add_frame('WXXX', 'ENG', 'FreeDB URL', $tag->{WXXX}) ;
        }
        elsif ($fname eq 'COMM')
        {
            $id3v2->add_frame('COMM', 'ENG', 'Comment', $tag->{COMM}) ;
        }
        else
        {
            $id3v2->add_frame($fname, $tag->{$fname});
        }
    }

    $id3v2->write_tag();
    return 0;
}
# }}}
The print_tag_info() function simply prints out a summary of the tag.
Unlike Data::Dumper, which I've used elsewhere in autotag.pl (sometimes
needlessly, I must say), print_tag_info() provides a nice, user-oriented
printout of the hash tag elements. Note that this function expects the tag as a
hash reference; the file name is used only in the heading it prints.
The guess_track_number() and guess_artist_and_track() functions do the
best they can, given a file name and possibly some ID3 tag information.
Note that guess_track_number() assumes that track numbers are rarely higher than 29, so it only looks for 0 through 29 in the file name.
Listing 7. The print_tag_info(), guess_track_number(), and guess_artist_and_track() functions
# {{{ print_tag_info: print the tag info
sub print_tag_info
{
    my $filename = shift @_;
    my $tag = shift @_;
    my $extra = shift @_ || 'Track info';

    # argument checking
    return unless ref $tag eq 'HASH';

    print "$extra for '$filename':\n";
    foreach (keys %$tag)
    {
        printf "%10s : %s\n", $_, $tag->{$_};
    }
}
# }}}

# {{{ guess_track_number: guess track number from ID3 tag and file name
sub guess_track_number
{
    my $filename = shift @_;
    my $tag = shift @_ || return undef;

    $filename = basename($filename); # directories can contain confusing data

    # first try to guess the track number from the old tag
    if (exists $tag->{TRCK} && contains_word_char($tag->{TRCK}))
    {
        my $n = $tag->{TRCK} + 0; # fix tracks like 1/10
        return $n;
    }
    elsif ($filename =~ m/([012]?\d).*\.[^.]+$/)
    # now look for numbers in the filename (0 through 29)
    {
        print "Guessed track number $1 from filename '$filename'\n"
            if $config->DEBUG();
        return $1;
    }

    return undef; # if all else fails, return undef
}
# }}}

# {{{ guess_artist_and_track: guess artist and track from file name
sub guess_artist_and_track
{
    my $filename = shift @_;
    my $artist;
    my $track;

    $filename = basename($filename); # directories can contain confusing data

    if ($filename =~ m/([^-_]{3,})\s*-\s*(.{3,})\s*\.[^.]+$/)
    {
        print "Guessed artist $1 from filename '$filename'\n"
            if $config->DEBUG();
        $artist = $1;
        $track = $2;
    }

    return ($artist, $track);
}
# }}}
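
To see what the two regular expressions in Listing 7 actually capture, they can be tried on sample file names without any MP3 files or configuration. The sketch below uses hypothetical standalone functions (track_from_name() and artist_track_from_name() are not in autotag.pl) and adds one step of my own: trimming whitespace, because the greedy first group of the artist pattern keeps the space before the dash.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical standalone variant of the file name test in guess_track_number()
sub track_from_name {
    my $name = basename(shift);
    return $name =~ m/([012]?\d).*\.[^.]+$/ ? $1 : undef;
}

# Hypothetical standalone variant of guess_artist_and_track(), plus trimming
sub artist_track_from_name {
    my $name = basename(shift);
    return unless $name =~ m/([^-_]{3,})\s*-\s*(.{3,})\s*\.[^.]+$/;

    my ($artist, $track) = ($1, $2);
    s/^\s+|\s+$//g for $artist, $track;    # my addition: strip stray spaces
    return ($artist, $track);
}

my $number = track_from_name('/music/05 - Example Song.mp3');
my ($artist, $track) = artist_track_from_name('Example Artist - Example Song.mp3');

print "track number: $number\n";
print "artist: '$artist', track: '$track'\n";
```

Note that the digit in the "mp3" extension is not mistaken for a track number, because the pattern requires a dot and an extension after the digits it captures.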
I use the data returned from the FreeDB search to make an anonymous
hash with the appropriate elements. The mapping between
WebService::FreeDB fields and ID3v2 tag elements is tentative, but it has
worked very well for me.
Listing 8. The make_tag_from_freedb() function
# {{{ make_tag_from_freedb: make the ID3 tag info from a FreeDB entry
sub make_tag_from_freedb
{
    my $disc = shift @_;
    my $track = shift @_;

    # argument checking: tracks are numbered from 1, so reject 0 as well
    # (after the decrement below, 0 would become -1 and select the last track)
    return undef unless defined $track && $track =~ m/^\d+$/ && $track > 0;

    # note that the user inputs track "1" but WebService::FreeDB gives us that
    # track at position 0, so we decrement $track
    $track--;

    return undef unless exists $disc->{trackinfo};
    return undef unless exists $disc->{trackinfo}->[$track];

    my $track_data = $disc->{trackinfo}->[$track];

    return {
        TIT1 => $disc->{genre},
        TIT2 => $track_data->[0],
        TRCK => $track+1,
        TPE1 => $disc->{artist},
        TALB => $disc->{cdname},
        TYER => $disc->{year},
        WXXX => $disc->{url},
        COMM => $disc->{rest}||'',
    };
}
# }}}
Mass tagging, mass renaming, stripping comments, and guessing track numbers
The main functionality of autotag.pl is to identify MP3 files. In the
course of that process, however, minor adjustments often need to be made
to large groups of files. Enter the Four Autotagging Horsemen.
Stripping comments is a very simple process.
I get a hash tag with get_tag(), empty the
COMM and WXXX fields, and write it back with set_tag(). In fact, comment
stripping could have been done through mass tagging, but it's used so
often that I felt I needed a separate option for it.
Guessing track numbers is also quite simple. Get the hash tag, use
guess_track_number() on the file and the hash tag, ask for confirmation,
and write the tag back to the file.
Mass tagging sets one or more tag elements (e.g. TALB) on a series of
files. You say, for instance,
autotag.pl -m "TALB=Best" *.mp3
and all the files that have the mp3 extension will be assigned that
TALB value in their ID3v2 tag. Mass-tagging is very nice when, for
example, you have a directory full of music by an artist and want to tag
all that music with the artist's name. Only supported tag elements can be
mass-tagged. Again, I get the hash tag, make my changes, and write it
back. The goal is to make it simple and easy to maintain.
Listing 9. Mass tagging, comment stripping, and guessing track numbers
# {{{ handle the one-shot options
if ($config->GUESS_TRACK_NUMBERS_ONLY() ||
    $config->STRIP_COMMENT_ONLY() ||
    scalar keys %{$config->MASS_TAG_ONLY()})
{
    foreach my $file (@ARGV)
    {
        my $tag = get_tag($file, 1);
        unless (defined $tag)
        {
            warn "No ID3 TAG info in '$file', skipping";
            next;
        }

        next if $config->DRYRUN();

        # delegate stripping comments to the mass tagging function
        if ($config->STRIP_COMMENT_ONLY())
        {
            $config->MASS_TAG_ONLY()->{COMM} = '';
            $config->MASS_TAG_ONLY()->{WXXX} = '';
        }

        if (scalar keys %{$config->MASS_TAG_ONLY()})
        {
            foreach (keys %{$config->MASS_TAG_ONLY()})
            {
                unless (exists $supported_frames{$_})
                {
                    warn "Unsupported tag element $_ requested for mass tagging, skipping";
                    next;
                }
                $tag->{$_} = $config->MASS_TAG_ONLY()->{$_};
            }
            set_tag($file, $tag);
        }
        else
        {
            my $track_number_guess = guess_track_number($file, $tag);

            # warn only when there is no guess, not when the user declines it
            if (!defined $track_number_guess)
            {
                warn "Could not guess a track number for file $file, sorry";
            }
            elsif (read_yes_no("Is track number $track_number_guess OK for '$file'?", 1))
            {
                $tag->{TRCK} = $track_number_guess;
                set_tag($file, $tag);
            }
        }
    }
    exit 0;
}
# }}}
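
The core of mass tagging, stripped of AppConfig and file handling, fits in a few lines. In autotag.pl the A=X pairs are split by AppConfig's hash-valued option; the sketch below does the splitting by hand so that it runs on its own. The function apply_mass_tags() is hypothetical, and the tag hash is an in-memory sample rather than one read from a file.

```perl
use strict;
use warnings;

my %supported_frames = map { $_ => 1 } qw(TIT1 TIT2 TRCK TALB TPE1 COMM WXXX TYER);

# Hypothetical helper: apply "FRAME=value" pairs to a tag hash in place and
# return the pairs that were rejected (malformed or unsupported frame).
sub apply_mass_tags {
    my ($tag, @pairs) = @_;
    my @rejected;

    foreach my $pair (@pairs) {
        # limit of 2 keeps any "=" inside the value intact
        my ($frame, $value) = split /=/, $pair, 2;
        if (defined $value && exists $supported_frames{$frame}) {
            $tag->{$frame} = $value;
        }
        else {
            push @rejected, $pair;
        }
    }
    return @rejected;
}

my %tag = (TALB => '', TPE1 => 'Example Artist');
my @rejected = apply_mass_tags(\%tag, 'TALB=Best', 'TXXX=nope', 'COMM=a=b');

print "TALB is now '$tag{TALB}'\n";
print "Rejected: @rejected\n";
```

Stripping comments is then just the special case of passing COMM= and WXXX= with empty values, which is how Listing 9 delegates it.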
Ah, the mass renaming option. I left it for last because it's the
most complex one. For each renaming parameter, I make each "%" in the tag
value appear as "{{{%}}}" because otherwise, those "%" characters, when
followed by one of the special renaming parameters, could be
misinterpreted. Take an artist named "100%true" for instance: once that
name is in place, a later pass would treat its "%t" as a renaming parameter
and produce "100ALBUMrue" instead, where ALBUM is the album name I get from
the hash tag.
Mass renaming also eliminates bad characters, and replaces certain
characters with "_" to ensure a reasonable file name. Finally, unless the
-c (accept_all) option is given from the command line, autotag.pl will ask
if it's okay to rename the file.
Listing 10. Mass renaming
# {{{ handle the -rename_only option
if ($config->RENAME_ONLY())
{
    foreach my $file (@ARGV)
    {
        my $tag = get_tag($file, 1);
        # the extra parameter will ask us about upgrading V1 to V2
        unless (defined $tag)
        {
            warn "No ID3 TAG info in '$file', skipping";
            next;
        }

        my %map = (
            '%c' => 'COMM',
            '%s' => 'TIT2',
            '%a' => 'TPE1',
            '%t' => 'TALB',
            '%n' => 'TRCK',
        );

        my $name = $config->RENAME_FORMAT();

        foreach my $key (keys %map)
        {
            my $tagkey = $map{$key};
            my $replacement = '';
            if (exists $tag->{$tagkey})
            {
                # limit to N characters
                $replacement = substr $tag->{$tagkey}, 0, $config->RENAME_MAX_CHARS();
                if ($tagkey eq 'TRCK' && $replacement =~ m/^\d$/)
                {
                    $replacement = "0$replacement";
                }
            }

            # this is how we preserve %a in the fields, for example
            $replacement =~ s/%/{{{%}}}/g;
            $name =~ s/$key/$replacement/;
        }

        # turn the {{{%}}} back into % in the fields
        # (the braces are escaped because newer Perls complain about bare ones)
        $name =~ s/\{\{\{%\}\}\}/%/g;
        print "The name after % expansion is $name\n" if $config->DEBUG();

        foreach my $char (map { quotemeta } @{$config->RENAME_BADCHARS()})
        {
            $name =~ s/$char//g;
        }
        print "The name after character removals is $name\n" if $config->DEBUG();

        # the replacement is inserted literally, so it needs neither
        # quotemeta (which would add backslashes) nor the /e modifier
        my $newchar = $config->RENAME_REPLACEMENT();
        foreach my $char (map { quotemeta } @{$config->RENAME_REPLACECHARS()})
        {
            $name =~ s/$char/$newchar/g;
        }
        print "The name after character replacements is $name\n" if $config->DEBUG();

        if ($name eq $file)
        {
            # do nothing
            print "Renaming $file is unnecessary, it already answers to our high standards\n"
                if $config->DEBUG();
        }
        elsif (-e $name)
        {
            warn "Could not use name $name for $file, it's already taken by an existing file or directory";
        }
        elsif ($config->ACCEPT_ALL() || read_yes_no("Is name $name OK for '$file'?", 1))
        {
            next if $config->DRYRUN();
            print "Renaming $file -> $name\n";
            rename($file, $name)
                or warn "Could not rename $file to $name: $!";
        }
        else
        {
            # do nothing
        }
    }
    exit 0;
}
# }}}
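
The reason for the "{{{%}}}" detour is easiest to see with the expansion isolated from the rest of the renaming code. The sketch below is a simplified illustration, not the code from autotag.pl: expand_format() and naive_expand() are hypothetical names, the keys are processed in sorted order so the result is repeatable, and the values are sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: expand %x parameters while protecting any "%" that
# comes from the tag values themselves.
sub expand_format {
    my ($format, $values) = @_;
    my $name = $format;

    foreach my $key (sort keys %$values) {
        my $replacement = $values->{$key};
        $replacement =~ s/%/{{{%}}}/g;           # hide "%" from later passes
        $name =~ s/\Q$key\E/$replacement/;
    }

    $name =~ s/\{\{\{%\}\}\}/%/g;                # restore the hidden "%"
    return $name;
}

# The same loop without the protection, for comparison
sub naive_expand {
    my ($format, $values) = @_;
    my $name = $format;
    foreach my $key (sort keys %$values) {
        $name =~ s/\Q$key\E/$values->{$key}/;
    }
    return $name;
}

my %values = ('%a' => '100%true', '%t' => 'Best');

my $safe  = expand_format('%a-%t.mp3', \%values);
my $naive = naive_expand('%a-%t.mp3', \%values);

print "protected:   $safe\n";
print "unprotected: $naive\n";
```

In the unprotected version the "%t" inside the artist name is replaced by the album name and the real "%t" in the format is left untouched, which is exactly the failure the escaping prevents.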
Conclusion
The second part of this article will discuss the main loop of
autotag.pl, and show common usage of the program.
Resources
- Read all of Ted's Perl articles in the "Cultured Perl" series on developerWorks.
- Download the autotag application (rename the file to autotag.pl in order to run
it).
- The CPAN module archive contains many useful Perl programs.
- Ted chose the free FreeDB project over the proprietary CDDB / Gracenote as a
database backend.
- You'll find resources on the many different audio formats including MIDI, MP3, and Ogg Vorbis at the Open Directory.
- If you've ever had trouble with Term::ReadLine::Perl, try using Term::ReadLine::Gnu instead.
- The id3lib library is for reading, writing, and manipulating ID3v1 and ID3v2 tags.
- The MP3::Tag CPAN module is for reading tags of MP3 audio files.
- The WebService::FreeDB CPAN module retrieves entries from FreeDB by searching for keywords.
- The MP3::ID3Lib CPAN module allows you to edit and add ID3 tags in MP3 files.
- The CPAN bundle installs the MusicBrainz client and required modules.
- AudioFile-Identify-MusicBrainz, another CPAN interface to the MusicBrainz service, is a pure Perl MusicBrainz client implementation.
- The IBM developerWorks article "Thinking XML: Manage metadata with MusicBrainz" discusses the metadata aspect of the MusicBrainz service.
- AppConfig is a Perl5 module for managing application configuration information.
- Learn more about AppConfig in Ted's column, "Application configuration with Perl".
- Playing around with audio? Learn more about voice-enabling your apps in the IBM developerWorks articles "Introducing XHTML + Voice -- IBM's proposal to the W3C on developing multimodal UIs" and "Multimodal applications."
- You'll also want to take a look at IBM alphaWorks' Voice Toolkit Preview.
- Or, try Music Sketcher, a graphical composition tool from the multimedia gurus at IBM Research.
About the author
Teodor Zlatanov graduated with an M.S. in computer engineering from Boston
University in 1999. He has worked as a programmer since 1992, using Perl,
Java, C, and C++. His interests are in open source work on text parsing,
three-tier client-server database architectures, Unix system
administration, CORBA, and project management. Contact Teodor at tzz@bu.edu.