Mp3tag features an internal Web Sources Framework which is configured through web source description files. Using these description files, you can import tag data from, in principle, any web site that presents artist/album information as plain HTML (not via JavaScript or ActiveX). You can find many examples in the Web Sources Archive on the Mp3tag forums.
[Name]
Name of the web source, e.g. Discogs.com
[BasedOn]
Base URL of the web source, e.g. http://www.discogs.com
[IndexUrl]
Search URL (%s is replaced by the search criteria entered by the user), e.g. http://www.discogs.com/artist/%s
[AlbumUrl]
Base URL for result pages (the URL found in the first search pass is appended to it), e.g. http://www.discogs.com
[WordSeperator]
Character/string used instead of blanks within the search criteria entered by the user, e.g. +
[IndexFormat]
Format string for splitting the output buffer from the first search pass into different fields. The %_url% field is required, e.g. %_url%|%album%|%type%|%label%
[SearchBy]
Field(s) which are offered as search criteria by the web source, e.g. %artist%
[Encoding]
Encoding used for all URLs. Can be utf-8, iso-8859-1, url, url-utf-8 or ansi (the system codepage will be used).
[ParserScriptIndex]
This key contains a multi-line parser script (start with ...) which parses the search results page for different albums.
[ParserScriptAlbum]
This key contains a multi-line parser script (start with ...) which parses a web page found by the first search pass.
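To make the interplay of [IndexUrl], [WordSeperator], [IndexFormat] and [AlbumUrl] concrete, here is a minimal Python sketch of how these values could fit together. It is not Mp3tag's implementation: the function names, the sample result line and the percent-encoding step are assumptions made for illustration only.

```python
from urllib.parse import quote


def build_search_url(index_url: str, criteria: str, word_separator: str) -> str:
    """Substitute the search criteria for %s in an [IndexUrl] value."""
    # Blanks in the criteria are replaced by the [WordSeperator] string.
    # Percent-encoding each word as UTF-8 is an assumption standing in for
    # whatever the [Encoding] key would request.
    words = [quote(word, safe="") for word in criteria.split()]
    return index_url.replace("%s", word_separator.join(words))


def split_index_line(index_format: str, line: str) -> dict:
    """Map one '|'-separated result line onto the [IndexFormat] field names."""
    names = [name.strip("%") for name in index_format.split("|")]
    return dict(zip(names, line.split("|")))


def build_album_url(album_url: str, fields: dict) -> str:
    """Append the %_url% value from the first pass to the [AlbumUrl] base."""
    return album_url + fields["_url"]


if __name__ == "__main__":
    print(build_search_url("http://www.discogs.com/artist/%s", "Example Artist", "+"))
    # The result line below is made-up sample data, not real Discogs output.
    row = split_index_line("%_url%|%album%|%type%|%label%",
                           "/release/1|Example Album|CD|Example Label")
    print(row)
    print(build_album_url("http://www.discogs.com", row))
```

The sketch shows why %_url% has to be part of [IndexFormat]: without it there would be nothing to append to the [AlbumUrl] base for the second pass.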
First of all, you need to find a way to identify the lines which contain the interesting
bits of information, for example the year of a release. The Mp3tag parser sees the page as
lines and characters, and you tell the parser how to move from the start to the places
whose content you want to output.
Mp3tag uses a pointer which is positioned at the beginning of the file and which can be moved
with several commands. So how do we move this pointer to the year displayed on a website?
We can either move it by counting lines or tell it to move down until it finds the text "Year:".
To do the first we would use either the command MoveLine N or the command GotoLine N (where N
is a number of lines): MoveLine N means go down N lines from where you are, and GotoLine N
means go to the Nth line from the top of the text.
To do the search, the command would be FindLine "Year:", which finds the next line from where
you are that contains the text "Year:".
In all cases the pointer would be moved to the first character of the target line.
From there we could tell the pointer to move N steps to the right (MoveChar N), to move to
the Nth character in the line (GotoChar N), or to position the pointer after the text
"Year:" (FindInLine "Year:"). All of these commands would result in the
pointer being moved to the first digit of the year. Please note the difference
between the FindLine and FindInLine commands: the former goes through the lines
from where you are to find a text and places the pointer at the beginning of that line, while the latter looks within
the line where the pointer is and positions it after the found text.
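The pointer model described above can be sketched in a few lines of Python. This is a toy model, not Mp3tag's parser: the class name, the sample page, and the assumptions that line and character numbers count from 1 and that FindLine also inspects the current line are all illustrative.

```python
class Pointer:
    """Toy model of the parser position: a line index plus a column index."""

    def __init__(self, text: str) -> None:
        self.lines = text.splitlines()
        self.line = 0  # zero-based index into self.lines
        self.col = 0   # zero-based index into the current line

    def move_line(self, n: int) -> None:     # MoveLine N
        self.line += n
        self.col = 0

    def goto_line(self, n: int) -> None:     # GotoLine N (assumed 1-based)
        self.line = n - 1
        self.col = 0

    def find_line(self, s: str) -> bool:     # FindLine S
        # Assumption: the search starts with the current line.
        for i in range(self.line, len(self.lines)):
            if s in self.lines[i]:
                self.line, self.col = i, 0
                return True
        return False

    def move_char(self, n: int) -> None:     # MoveChar N
        self.col += n

    def goto_char(self, n: int) -> None:     # GotoChar N (assumed 1-based)
        self.col = n - 1

    def find_in_line(self, s: str) -> bool:  # FindInLine S
        pos = self.lines[self.line].find(s, self.col)
        if pos < 0:
            return False
        self.col = pos + len(s)  # pointer ends up after the found text
        return True

    def rest(self) -> str:
        """Text from the pointer to the end of the current line."""
        return self.lines[self.line][self.col:]


# Made-up sample page; the year sits on the third line.
PAGE = "Artist: Example Artist\nAlbum: Example Album\nYear:1997\nLabel: Example Label"

by_search = Pointer(PAGE)
by_search.find_line("Year:")
by_search.find_in_line("Year:")

by_absolute = Pointer(PAGE)
by_absolute.goto_line(3)
by_absolute.goto_char(6)

by_relative = Pointer(PAGE)
by_relative.move_line(2)
by_relative.move_char(5)
```

On this sample page all three pointers end up on the first digit of the year, which is the equivalence the paragraph above describes.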
Now that we are where we want to be, we need to tell Mp3tag to store the data. To do this, a Say command is
used, but we have to tell it what to say and what not. If we want to output the rest of the line,
we use the SayRest command. An alternative would be the SayUntil " " command, which outputs
everything until a space character is found. For a number such as the year, the SayNextNumber
command outputs the next numeric value from the input.
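The difference between these Say commands is easiest to see on a single line. The following Python sketch models them as plain string functions; it is an illustration under assumptions (in particular, what happens when the stop text of SayUntil is not found, and what counts as a number, are guesses), not a description of Mp3tag's internals.

```python
import re


def say_rest(line: str, col: int) -> str:
    """SayRest: everything from the pointer to the end of the line."""
    return line[col:]


def say_until(line: str, col: int, stop: str) -> str:
    """SayUntil S: everything from the pointer up to (not including) S."""
    end = line.find(stop, col)
    # Assumption: if S does not occur, the rest of the line is returned.
    return line[col:] if end < 0 else line[col:end]


def say_next_number(line: str, col: int) -> str:
    """SayNextNumber: the next run of digits at or after the pointer."""
    match = re.search(r"\d+", line[col:])
    return match.group(0) if match else ""


# Made-up sample line; COL is where FindInLine "Year:" would leave the pointer.
LINE = "Year:1997 (reissue)"
COL = len("Year:")

print(say_rest(LINE, COL))
print(say_until(LINE, COL, " "))
print(say_next_number(LINE, COL))
```

On this sample line SayRest would also pick up the trailing "(reissue)", which is why a narrower command is usually the better choice for a year.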
So, a script to display the year would look like this:
FindLine "Year:"
FindInLine "Year:"
SayNextNumber
You will notice that the example uses Find commands rather than Move or Goto commands. Whenever you have a chance to use a Find command, please do so, because web pages tend to change. A script that relies on Find commands is more likely to survive a change in the raw data than one that relies on absolute positions.
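The following Python sketch illustrates this advice on two made-up versions of the same page, where the newer version has gained one extra line. The helper names and the sample pages are hypothetical; only the contrast between position-based and search-based extraction is the point.

```python
def year_by_position(page: str, line_no: int, char_no: int) -> str:
    """Position-based lookup, like GotoLine / GotoChar / SayRest (1-based)."""
    line = page.splitlines()[line_no - 1]
    return line[char_no - 1:]


def year_by_search(page: str, label: str = "Year:") -> str:
    """Search-based lookup, like FindLine / FindInLine / SayRest."""
    for line in page.splitlines():
        pos = line.find(label)
        if pos >= 0:
            return line[pos + len(label):]
    return ""


# Hypothetical page before and after the site added a "Genre:" line.
OLD_PAGE = "Artist: Example Artist\nYear:1997"
NEW_PAGE = "Artist: Example Artist\nGenre: Example Genre\nYear:1997"

print(year_by_position(OLD_PAGE, 2, 6), year_by_position(NEW_PAGE, 2, 6))
print(year_by_search(OLD_PAGE), year_by_search(NEW_PAGE))
```

The position-based lookup returns part of the new "Genre:" line once the page changes, while the search-based lookup still returns the year.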
Working all this out on paper can be a bit tricky, and if you make an error counting lines or
characters you might end up with quite unexpected results. To check what the parser is doing,
you can add a Debug "on" "debug.out" command to the top of your script.
This will give you an output file which shows step by step what the parser is doing and
why you end up with a given output.
Often you will even want to start your script with just a Debug command, to see what data you
actually get for parsing before you build your script step by step.
Command | Parameter(s) | Description |
FindLine | S n | Find line with first or Nth occurrence of S (starting from the current position) |
FindLineNoCase | S n | Find line with first or Nth occurrence of S (ignoring case and starting from the current position) |
FindInLine | S n | Find the next/Nth occurrence of S within the current line |
GotoChar | N | Skip to the Nth character in the current line |
GotoLine | N | Go to Nth line (counting from top) |
MoveChar | N | Move right/left N characters |
MoveLine | N | Move down/up N lines (starting from current position) |
Say | S | Send S to output |
SayUntil | S | Send everything until S to output |
SayUntilML | S | Send everything until S to output, searching across multiple lines |
SayRest | | Send everything to the end of the current line to output |
SayNChars | N | Send next N characters to output |
SayOutput | S | Send the content of output S to the current output. The output CurrentUrl is always generated at runtime. |
SayNextNumber | | Outputs the next numeric value from the input. |
SayNextWord | | Outputs the next word from the input. |
SayNewline | | Outputs a carriage return, line feed (CR LF) sequence. |
SayRegexp | S s s | Outputs all matches of the regular expression in the first parameter, separated by the string in the second parameter, and ignores the line if the string in the third parameter cannot be found. |
Set | S s | Sets the content of output S to the value s. Resets the content if s is omitted. |
SkipChar | S | Skip any characters contained in S |
If | S | Check for occurrence of S at the current position. |
IfNot | S | Check for absence of S at the current position. |
Else | | Else branch of an If operation. |
Endif | | End of an If/IfNot operation. |
OutputTo | S | Sets the name of the output buffer of the Say commands to S. |
Do ... While | S n | Execute the commands enclosed by Do and While as long as S occurs at the current position. The optional second parameter limits the loop to at most n iterations. Nesting of Do commands is not allowed. |
Replace | S S | Replaces all occurrences of the first parameter by the second parameter. |
RegexpReplace | S S | Replaces everything that is matched by the regular expression in the first parameter by the string in the second parameter. |
JoinUntil | S | Joins the current line to the next occurrence of S. |
JoinLines | N | Joins the current line with the next N lines. |
KillTag | S s | Replaces tag S with s in current line (or blank if omitted). |
Unspace | | Removes leading and trailing spaces from the current line. |
Debug | S s n | Debug output. S is "on" or "off", s is an optional file name, and n is an optional maximum file size for the debug file in MB. |
N Required numeric parameter
S Required string parameter (in quotes)
n Optional numeric parameter
s Optional string parameter (in quotes)
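The output-related commands in the table (OutputTo, Say, SayOutput and Set) all operate on named output buffers. A minimal Python sketch of that idea follows. It is an assumption-laden model rather than Mp3tag's implementation: the class, the "default" buffer name and the sample buffer contents are hypothetical.

```python
class Outputs:
    """Toy model of named output buffers filled by the Say commands."""

    def __init__(self) -> None:
        # "default" is a hypothetical name for the buffer that is active
        # before any OutputTo command has been issued.
        self.current = "default"
        self.buffers = {"default": ""}

    def output_to(self, name: str) -> None:   # OutputTo S
        self.current = name
        self.buffers.setdefault(name, "")

    def say(self, text: str) -> None:         # Say S
        self.buffers[self.current] += text

    def say_output(self, name: str) -> None:  # SayOutput S
        self.say(self.buffers.get(name, ""))

    def set(self, name: str, value: str = "") -> None:  # Set S s
        # Omitting the value resets the buffer, as described in the table.
        self.buffers[name] = value


out = Outputs()
out.output_to("Artist")
out.say("Example Artist")
out.output_to("Title")
out.say_output("Artist")      # copy the Artist buffer into Title
out.say(" - Example Track")
print(out.buffers["Title"])
```

Calling `out.set("Artist")` afterwards would clear the Artist buffer while leaving the Title buffer untouched.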
All content and graphics are protected by copyright law! Copyright © 2000-2010 Florian Heidenreich