Handbook

This documentation applies to Audiveris version 4.3 and above, for Windows and Linux environments.

Getting started

Java Web Start

The goal is now to drive all installations and runs of Audiveris through Java Web Start. [Some information on former approaches is still available in the Development section]

Instead of multiple installation files, there is now just a single JNLP file, available on the Audiveris home page, to deploy and run the application on different operating systems such as Windows, Linux or Mac, using either 32-bit or 64-bit architectures.

Applications launched with Java Web Start are cached locally and can be automatically checked for updates. An Internet connection is really mandatory only for the very first launch. An already-downloaded application is launched on par with a traditionally installed application, especially as Audiveris shortcuts are installed (through a desktop icon and through a program menu item).

Note: As of this writing, this Java Web Start approach works with the Oracle Java environment only, whether on Windows or Ubuntu. Mac is not supported yet, and on Ubuntu, IcedTea Java needs further debugging.

To check or modify which Java Web Start tool is used by default on Ubuntu, use the command:

sudo update-alternatives --config javaws

First launch

To launch Audiveris the very first time, you have several possibilities, regardless of your operating system:

  • Either click on the button located in the upper right corner of this page, or the same button found on Audiveris home page. To be actually visible, this button requires that JavaScript be enabled in your browser. When clicked, it will ensure that an appropriate Java Runtime Environment is installed and then launch the JNLP application.
  • Or click on the following hypertext link to Launch the application. This link appears even if JavaScript is not enabled. However, it assumes that your browser and Java environments are properly configured to handle the linked JNLP file.
  • Or finally from a terminal directly use the command: javaws https://audiveris.kenai.com/jnlp/launch.jnlp
    (Note: mind the 's' in https. Otherwise you may get errors like "illegal URL redirect")
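
The terminal command above can also be assembled from a script. The following Python sketch only builds the argument list and checks the https scheme mentioned in the note. The helper name javaws_launch_command is hypothetical (it is not part of Audiveris or Java), and actually running the command assumes that javaws is available on the PATH.

```python
from urllib.parse import urlparse

LAUNCH_URL = "https://audiveris.kenai.com/jnlp/launch.jnlp"


def javaws_launch_command(url=LAUNCH_URL):
    """Build the javaws argument list (hypothetical helper, not part of Audiveris)."""
    if urlparse(url).scheme != "https":
        # A plain http URL may fail with an error like "illegal URL redirect".
        raise ValueError("use an https URL: " + url)
    return ["javaws", url]


if __name__ == "__main__":
    # Print the command instead of running it.
    # Pass the list to subprocess.run(...) to actually launch javaws.
    print(" ".join(javaws_launch_command()))
```

Building a list rather than a single string avoids shell quoting issues if the command is later passed to subprocess.run.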

All methods trigger the download and processing, by javaws (the Java Web Start launcher, which is part of the Java runtime), of a small XML file named launch.jnlp. This JNLP file describes the whole download and launch process. The main visible steps are described hereafter in sequence (most pictures are taken from the Windows environment; their Ubuntu equivalents are very similar):


The local Java runtime downloads the JNLP file and related resources.

The JNLP file describes a set of requirements, including the Java environment. If needed, a more recent Java runtime may get downloaded automatically.


The initial components for Audiveris application are downloaded as needed and cached locally.

The download takes place for the first launch only. After that, the components are taken directly from the local Java cache.


Since this is the first launch, an extension component named the "Audiveris bundle Installer" is called to install the whole bundle of needed software companions.

Compare with the previous window, and notice the name (Audiveris bundle Installer) as well as the JNLP file (installer.jnlp).


You may get a security warning like this one because Audiveris will need to escape the default sandbox and access local disk.

You can safely accept this and even check the option for not showing this warning again.


This is the installer User Interface:

  • The top row lets you select which languages should be supported by the embedded OCR.
  • The second row presents the sequence of companions to install. Some are optional (Examples, Plugins, Training data) and can be selected through a check box.
  • The middle area is meant for display of main messages.
  • The footer provides buttons to launch or cancel the installation.

Note that you cannot change the folder where Audiveris application data is installed. This feature may be provided in a future version.

The color of an item depends on its current status:

  • pink for an item that is needed (mandatory or selected optional) but not yet installed,
  • green for an installed needed item,
  • gray for an optional item not selected,
  • orange for an item being processed,
  • red for an item which failed to install.

Using the Add button from the language row, you can select languages on top of the predefined ones (deu, eng, fra, ita).

You can remove a language as well, even one of the predefined ones, via a right-click on the proper item in the language row. But make sure you don't install OCR with no language at all, otherwise it will fail at runtime.


Right after clicking Install, you are prompted for agreement with Audiveris license.

Click View License to open a browser on precise license content.

Select Yes if you accept license terms and want to continue installation.


Each companion is processed in sequence, as displayed by the current heading and the global progress bar.

If some external resource is downloaded, the status text, right above the global progress bar, displays the name of the remote URL being downloaded.

Note the progress bar indicates the global progress (in terms of companions) rather than the current download progress (in terms of bytes downloaded). So be patient, let the download proceed.


Depending on your environment, the installation of software companions may require write access to system locations such as the c:\Program Files folder under Windows, and similar locations for other operating systems. By default a standard user is not allowed to write to these locations, so you may be prompted for "elevation" to Administrator level to complete the installation.

On Windows you will see this typical UAC (User Account Control) dialog that recent Windows versions use to prompt for user agreement.

Click Yes to proceed and let the installer run.


On Ubuntu, you will see a gksudo or a kdesudo prompt for your password in order to perform the final administrative task.

Enter your password and proceed.


At the end, this message notifies a successful completion. You can now safely exit the installer so that Audiveris application can proceed.

If one or several companions failed to install, you will instead get a global installation failure message (on top of the dialog(s) describing each failure context).

Note that all details of installation, whether successful or not, are kept in a dedicated log file named audiveris-installation-<TIMESTAMP>.log and located in your temporary directory.
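
To locate these log files from a script, a wildcard search on the file name is enough. The sketch below is only an illustration: the function name is hypothetical, the exact format of <TIMESTAMP> is not assumed (a wildcard is used instead), and it assumes that the temporary directory reported by Python is the same one the installer used, which may not hold on every system.

```python
import glob
import os
import tempfile


def find_installation_logs(temp_dir=None):
    """List installer log files, most recently modified first (hypothetical helper)."""
    # Assumption: the installer wrote into the same temporary directory
    # that Python reports for the current user.
    temp_dir = temp_dir or tempfile.gettempdir()
    pattern = os.path.join(temp_dir, "audiveris-installation-*.log")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    for path in find_installation_logs():
        print(path)
```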


This is the end. If installation succeeded you can now see the main Audiveris application window.

Next launches

For the next launches, you can still use either the Launch button or the plain link on Audiveris home page, or the command line in a terminal window.

Since the application is now "installed", you can use the shortcuts as well:

  • The desktop icon:
  • The Audiveris menu item in Windows start menu.

Data is now kept in Java cache. Hence, whatever the way you launch Audiveris JNLP file, all sequences are now very short:

  1. The initial window "Java 7..." or similar appears for a couple of seconds,
  2. The final Audiveris application window follows immediately.

32-bit vs 64-bit: "Tesseract OCR is not installed properly"

A 32-bit OS can run only 32-bit programs, but a 64-bit OS can usually run both 32-bit and 64-bit programs. This applies to the javaws program as well, and typically you can have different Java environments installed on your OS. Java byte-code is OS and architecture independent, but Audiveris needs Tesseract OCR software, a C++ program, accessed through JNI in the same process. The related binary files are installed by Audiveris installer into proper (Windows) system folders: a 32-bit javaws will target 32-bit system folders while a 64-bit javaws will target 64-bit system folders.

If, on subsequent launches, you observe messages like "Error while loading library jniTessBridge" (in the log window) or "Tesseract OCR is not installed properly" (in a popup), they signal an architecture mismatch: you are trying to load 64-bit binaries from a 32-bit javaws/JVM or vice versa. Typically, for some unknown reason, the shortcut installed by JNLP may point to a 32-bit version of Java. This has been observed on Windows when both 32-bit and 64-bit Java environments are installed.

To fix this, you can modify the shortcut as follows. Use a right-click on the icon, to select Properties then Shortcut. The "target" field points to a javaws.exe program. If the field begins with C:\Windows\SysWOW64\javaws.exe then it is pointing to 32-bit Java. So modify it so that it points to 64-bit Java, using either javaws.exe alone if the 64-bit Java appears first in the path, or an explicit link such as C:\Program Files\Java\jre7\bin\javaws.exe.
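
The check described above (does the shortcut target begin with the SysWOW64 path?) can be expressed as a small test. This Python sketch is illustrative only: the function name is hypothetical, and it implements just the single rule stated in this section, so a 32-bit Java installed elsewhere (for example under C:\Program Files (x86)) would not be detected.

```python
SYSWOW64_JAVAWS = "c:\\windows\\syswow64\\javaws.exe"


def points_to_32bit_javaws(target):
    """Return True if a shortcut target begins with the SysWOW64 javaws path.

    Hypothetical helper: it only applies the rule given in the handbook.
    """
    # Tolerate surrounding quotes, forward slashes and mixed case.
    normalized = target.strip().strip('"').replace("/", "\\").lower()
    return normalized.startswith(SYSWOW64_JAVAWS)


if __name__ == "__main__":
    print(points_to_32bit_javaws(r"C:\Windows\SysWOW64\javaws.exe"))
    print(points_to_32bit_javaws(r"C:\Program Files\Java\jre7\bin\javaws.exe"))
```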

Better yet, you can even have both 32-bit and 64-bit versions of Audiveris active at the same time, each with its own set of binaries in proper system folders. To do so, you have to install each version, for example (assuming you have uninstalled any previous version):

  1. (32-bit)\javaws https://audiveris.kenai.com/jnlp/launch.jnlp
  2. javaws -uninstall https://audiveris.kenai.com/jnlp/launch.jnlp
  3. (64-bit)\javaws https://audiveris.kenai.com/jnlp/launch.jnlp

The purpose of step #2 (uninstall) is to remove Audiveris from Java cache so that the next call to javaws (step #3) will actually trigger the download of the needed binaries from Audiveris web site. From that point on, you can use either 32-bit javaws or 64-bit javaws (or whatever launch mechanism you choose) and it will launch the corresponding version immediately since both binary sets are available, while Audiveris byte-code is in Java cache.

Application updates

New versions of Audiveris resources may become available on the server. Every time you launch Audiveris, a quick network access is performed to check whether there are new versions of resources. If so, you will be asked whether you want to update the application.

Note that due to time-zone differences between server and client, this might result in a delay of up to 24 hours between when a new Jar is uploaded, and when it is recognized as new content to be updated.

Java cache

So, where has Audiveris application gone? In the global Java cache. And you can monitor this Java cache, from the Java Control Panel.

To display the Java Control Panel:

  • On Windows, use the Control Panel and select Java.
  • On Ubuntu, use the terminal command: ControlPanel.

From this panel:

Settings... lets you manage the cache globally, in particular through Delete Files... to clear everything and restart from a fresh cache.

View... lets you manage each cached application individually.

More directly, from a terminal in Windows or Ubuntu, the command line:
javaws -viewer
opens both the Java Control Panel and its applications viewer.


This view displays all installed cached Java applications.

Select the Audiveris application, and use either the icons in the tool bar or a right-click for context menu, to:

  • Run the application, either online (connected to the web) or offline (with no web connection).
  • Display the application JNLP file.
  • Install shortcuts if not done yet.
  • Uninstall the application.
  • Display the application web page.

Uninstallation

To uninstall Audiveris, you can:

  • Either use the cache view as shown above, and select the uninstallation action,
  • Or, in a terminal window, use the command line: javaws -uninstall https://audiveris.kenai.com/jnlp/launch.jnlp
    [note the -uninstall option]

This is the window which tells from the Cache Viewer that the application is being removed.

You then get the Audiveris bundle Installer window, this time running for uninstallation. For the time being, the "uninstaller" is just a stub, meaning the software companions are not actually uninstalled and you get the following message almost immediately:


Looking at the Cache Viewer, you can see that Audiveris no longer appears in the Applications section.

But selecting the Deleted Applications section, you can see that Audiveris is listed among the deleted applications. From there, using tool bar icons or right-click, you can:

  • Either re-install the application. This will run quickly since most data parts are still in the cache.
  • Or, actually remove application data from the cache. From that point on, any reinstall will imply a new download.

Loading an image

chula-input

The main purpose of a music scanner like Audiveris is to analyze the image of a sheet of music and transcribe it to the standard symbolic format that all other music applications can read and write for further processing.

To load an image, use the menu File | Input and select some input file.

Another way to load an input file is to drag and drop it from the file explorer onto the Audiveris application window.

A few hints:

  • All major formats are supported, notably PDF, TIFF, JPG, PNG, BMP.
  • One input file leads to one score. One multi-page input file (PDF and TIFF formats provide this feature) leads to one score composed of as many pages.
  • Prefer gray-level images, with pixel values in the 0 - 255 range, to black and white images. Color images are supported as well.
  • If you scan paper sheets by yourself, pay attention to the scan resolution. Best results are obtained with resolution around 300 DPI. Lower resolutions may hide key details while higher ones turn quickly into a significant waste of CPU and memory resources.

Transcription

transcribe

To launch the image transcription, use the menu Score | Transcribe.

The same action can be launched directly from the related toolbar icon.


score-step

You can as well select the SCORE target through the menu Step | SCORE.

Don't be put off by the list of available steps:

  • All steps from LOAD to SCORE are the mandatory steps. In fact they will be processed as needed in sequence when some output is asked for. See the Command Line Interface section for a quick presentation of the steps sequence and the Internals chapter for further details.
  • The subsequent steps (PRINT, EXPORT, PLUGIN), those displayed after the separating line in the step menu, represent the actual outputs. These are the steps that a casual user is interested in.

Main window layout

score

Within a few seconds after selecting the SCORE step (or any other subsequent step) you should get a screen similar to this picture.

Since version 4, Audiveris has merged the former sheet and score views into a single panel. This saves screen space and allows quick visual checking.

Audiveris main window is now composed of 4 panels:

Sheet
This is the large panel in the upper left corner.
The Picture tab presents the input image, while the Data tab presents the objects (sections and glyphs) extracted from the image. In the Data tab, the objects representing staff lines or stems are drawn as thin lines.
Boards
The right panel is a vertical set of boards that are used for both user input and output. Only basic boards are displayed by default. A right-click in this column lets you display or hide selected boards.
Events
The lower left panel is a log of the main events that occurred so far. More details are available in the Audiveris log file (the precise path to this log file is displayed at the top of the event panel).
Errors
The lower middle panel displays a sorted list of detected errors. A click on an error line in this panel moves you to the related location in the sheet panel.

Sheet display modes

                Picture tab     Data tab
Physical mode   pic-physical    data-physical
Combined mode   pic-combined    data-combined
Logical mode    pic-logical     data-logical

For the sheet panel you can choose between 3 display modes:

  • The physical mode displays the sheet glyphs colorized according to their recognized shape and using their physical coordinates.
  • The logical mode displays the logical score entities built from the interpretation of the physical glyphs.
  • The combined mode is a combination of the physical and logical layers. It displays the logical entities in a translucent manner on top of the physical glyphs.

Using the menu Views | Switch layers you can cycle through the different modes Physical / Combined / Logical.
You can also use the F12 function key or the related toolbar icon.

Outputs

The transcription data, which results from the SCORE step, can then be further used by:

musescore
  • The PRINT step (or the menu Score | Print...) writes the resulting image into a PDF file. The image is basically the content of the Picture tab in logical mode.
  • The EXPORT step (or the menu Score | Export...) writes a MusicXML file with the exported score entities.
  • The PLUGIN step (or any plugin accessed through the Plugins menu) launches a plugged application on the exported MusicXML file. Refer to the Plugins section for further details.

Note there is no need to manually go through the intermediate steps. For example, loading an input file and selecting a plugin will trigger the steps from SCALE through SCORE + EXPORT + the selected plugin.
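
The "processed as needed" behavior can be pictured as follows. This Python sketch is an illustration under stated assumptions, not the actual Audiveris logic: only the step names mentioned in this handbook are listed (the real sequence contains more intermediate steps), their relative order is assumed, and the function and variable names are hypothetical.

```python
# Abbreviated list: only steps named in this handbook, in an assumed order.
MANDATORY = ["LOAD", "SCALE", "GRID", "PAGES", "SCORE"]

# Extra prerequisites of each output step, as far as the handbook states them:
# a plugin works on the exported MusicXML file, hence EXPORT before PLUGIN.
OUTPUT_PREREQS = {"PRINT": [], "EXPORT": [], "PLUGIN": ["EXPORT"]}


def steps_to_run(target, done=()):
    """Return the steps still to perform, in order, to reach `target`."""
    if target in MANDATORY:
        wanted = MANDATORY[: MANDATORY.index(target) + 1]
    elif target in OUTPUT_PREREQS:
        wanted = MANDATORY + OUTPUT_PREREQS[target] + [target]
    else:
        raise ValueError("unknown step: " + target)
    # Steps already performed are not run again.
    return [step for step in wanted if step not in done]


if __name__ == "__main__":
    # An input file is already loaded, then a plugin is selected.
    print(steps_to_run("PLUGIN", done=["LOAD"]))
```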

Basics

Entities

Let's introduce a small number of basic concepts:
Pixel
A pixel is the smallest picture element in the input image. A pixel exhibits a specific color, generally a level of gray. Using a binarization filter, we can separate foreground (rather black) pixels from background (rather white) pixels.
Run
A run is a horizontal or vertical vector of pixels of the same kind (foreground or background). A black (foreground) pixel is "assigned" to exactly one run.
Section
A section is a sequence of adjacent black runs, all of the same orientation. Sections do not overlap, hence a run belongs to exactly one section.
Glyph
A glyph is nothing but a set of sections, perhaps from different orientations. A section may belong to many overlapping glyphs at the same time, but is assigned at any moment to at most one (active) glyph.
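
To make the run and section definitions concrete, here is a minimal Python sketch that extracts horizontal runs from a tiny binary image and chains them into sections. It is a simplified illustration: the function names are hypothetical, and the grouping rule used here (a run continues a section only when it touches exactly one run of the previous row, and that run touches no other run of the current row) is an assumption; the actual Audiveris rule may differ.

```python
def horizontal_runs(image):
    """Return foreground runs as (row, start, length) tuples.

    `image` is a list of rows, each a list of 0 (background) or 1 (foreground).
    """
    runs = []
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                runs.append((y, start, x - start))
            else:
                x += 1
    return runs


def overlaps(a, b):
    """True if two runs share at least one column."""
    return a[1] < b[1] + b[2] and b[1] < a[1] + a[2]


def build_sections(runs):
    """Group runs into sections (lists of runs on consecutive rows)."""
    by_row = {}
    for run in runs:
        by_row.setdefault(run[0], []).append(run)
    sections = []
    open_section = {}  # run of the previous row -> section it belongs to
    for y in sorted(by_row):
        next_open = {}
        for run in by_row[y]:
            above = [r for r in by_row.get(y - 1, []) if overlaps(r, run)]
            section = None
            if len(above) == 1:
                below = [r for r in by_row[y] if overlaps(above[0], r)]
                if len(below) == 1:
                    section = open_section[above[0]]  # one-to-one: extend
            if section is None:
                section = []  # split or merge point: start a new section
                sections.append(section)
            section.append(run)
            next_open[run] = section
        open_section = next_open
    return sections


if __name__ == "__main__":
    image = [[1, 1, 1, 0],
             [0, 1, 1, 0],
             [1, 0, 1, 1]]
    for section in build_sections(horizontal_runs(image)):
        print(section)
```

Every foreground pixel ends up in exactly one run, and every run in exactly one section, which matches the two "exactly one" statements in the definitions above.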

sections

The following picture presents sections at the end of the GRID step. We can observe:

  • Unassigned vertical sections, displayed in light blue.
  • Unassigned horizontal sections, displayed in light pink.
  • Horizontal sections assigned to staff line glyphs, and displayed in ivory color.

Main user tools

Mouse

Moving

The page image is usually larger than the window where the page is displayed. You can move the display over the page using different means:

  • By moving the scroll bars,
  • By keeping the mouse left button pressed, and moving the selection point near a border of the display,
  • By keeping both mouse buttons pressed, and dragging the image with the selection point.

Zoom

When modifying the zoom factor, the display will remain focused on the selected entities, if any.

The zoom factor can be adjusted in the range 1:8 to 16:1:

  • By using the vertical logarithmic slider located on the left side of the sheet window.
  • By using the mouse wheel while keeping the CTRL key pressed.
  • By using the rectangular "lasso" while keeping both the keys CTRL and SHIFT pressed. When releasing the mouse, the zoom will be adjusted so that both rectangle sides get fully visible.
  • By using the two predefined toolbar buttons, you can adjust the zoom to fit the page width or the page height, respectively.
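
A logarithmic slider over the range 1:8 to 16:1 can be modeled as below. This Python sketch is only an illustration: the function names are hypothetical, the slider position is assumed to be normalized to the interval [0, 1], and the fit-to-width computation is an assumed reading of what "adjust the zoom according to the page width" means.

```python
import math

MIN_RATIO = 1 / 8   # zoom 1:8
MAX_RATIO = 16.0    # zoom 16:1


def slider_to_ratio(position):
    """Map a slider position in [0, 1] to a zoom ratio on a logarithmic scale."""
    position = min(1.0, max(0.0, position))
    return MIN_RATIO * (MAX_RATIO / MIN_RATIO) ** position


def ratio_to_slider(ratio):
    """Inverse mapping, with the ratio clamped to the allowed range."""
    ratio = min(MAX_RATIO, max(MIN_RATIO, ratio))
    return math.log(ratio / MIN_RATIO) / math.log(MAX_RATIO / MIN_RATIO)


def fit_width_ratio(view_width, page_width):
    """Zoom ratio that makes the page width fill the view (clamped)."""
    return min(MAX_RATIO, max(MIN_RATIO, view_width / page_width))


if __name__ == "__main__":
    for position in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(position, round(slider_to_ratio(position), 3))
```

With a logarithmic scale, equal slider movements multiply the zoom by the same factor, so the steps feel uniform at both small and large magnifications.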

Selection modes

There are 2 selection modes available: glyph-based (the default) or section-based. To switch from one mode to the other, use the toggle menu item Views | Enable section selection or the related toolbar icon.

The mouse-based selection works as expected, pointing to either glyph entities or section entities.

In section-selection mode, section boundaries are shown while these boundaries do not appear in glyph-selection mode.

In glyph-selection mode, the selected glyph may display links to its related translated entities. The links appear as short straight lines (and are driven by the option Views | Show glyph Translations).
Images next to this paragraph depict:

  1. A tuplet glyph linked to its 3 embraced chords
  2. A note head glyph shared by 2 different chords

Multi-selection

A left-click in an entity area selects this entity (and deselects the entity previously selected if any).

To select several entities:

  • Either select each entity, one after the other, keeping the CTRL key pressed. In section mode, since entities are usually rather small, you don't need to click on each and every section, simply keep the mouse pressed and move the pointer over the desired sections.
  • Or, by dragging the mouse while keeping the SHIFT key pressed, use a rectangular "lasso" to grab all the entities whose bounds are fully contained by the lasso rectangle.
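
The lasso rule (only entities whose bounds are fully contained are grabbed) amounts to a rectangle containment test. The Python sketch below illustrates it; the Rect type, the function names and the entity identifiers are all hypothetical and not taken from Audiveris.

```python
from collections import namedtuple

# Axis-aligned rectangle: top-left corner plus size, in pixels.
Rect = namedtuple("Rect", "x y width height")


def contains(outer, inner):
    """True if `inner` lies entirely within `outer` (borders included)."""
    return (outer.x <= inner.x
            and outer.y <= inner.y
            and inner.x + inner.width <= outer.x + outer.width
            and inner.y + inner.height <= outer.y + outer.height)


def lasso_select(lasso, entities):
    """Return the ids of entities whose bounds are fully inside the lasso.

    `entities` maps an entity id to its bounding Rect.
    """
    return [eid for eid, bounds in entities.items() if contains(lasso, bounds)]


if __name__ == "__main__":
    entities = {1: Rect(10, 10, 5, 5), 2: Rect(18, 10, 5, 5)}
    # Entity 2 sticks out on the right side of the lasso, so it is not selected.
    print(lasso_select(Rect(8, 8, 12, 12), entities))
```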

Whatever the selection mode and the number of selected entities, a right-click will display a popup context menu related to these entities.

Boards

boards

With a right-click in the boards pane, you get access to the boards selection, to customize which boards should be displayed or hidden (this depends on the type of view at hand: picture, runs, data):

Pixel
Displays the current position (point or rectangle) in pixels. The Level field gives the level of gray for the selected pixel. Note that, apart from Level, all the other fields are both output and input fields. Just modify their values and press return to modify the selected location.
Binarization
Displays the binarization environment that applies for the current pixel.
Run (Hori/Vert)
Displays the current black Run (horizontal or vertical) if any.
Section (Hori/Vert)
Displays the current Section (horizontal or vertical) if any. The Id field is both output and input, so a section can be directly selected via its ID.
Glyph
Displays parameters of the selected Glyph if any.
A glyph is a collection of sections, and it is never deleted, therefore it is always accessible via its Id.
A glyph is said to be Active if its sections point back to it, so the selection of one of its sections will select that glyph.
Focus
Lets you browse the whole sheet for specific shapes.
Eval
Displays the result of the glyph evaluation by the neural network evaluator.
The top 5 best shapes are displayed, with their related grade in range 0..100. A red background color indicates a shape manually discarded.
Shape
The shape palette gives access to shape families. Within a family, a shape can be assigned (by double-click) or dragged and dropped to a target location.
Check
There are several Check boards (Barline, Stem, Ledger). They are meant for the advanced user.

Context menu

With a right click in the sheet view, you get a popup menu whose content depends on the current context, notably the selected glyph if any.

Here are the main possibilities:

Measure #m...
Information / actions on the current measure.
Note that the displayed measure number is local to the current page even though the exported measure number will be score based. The difference is noticeable only in a multi-page score.
Slot #s...
Information / actions on the current time slot within the current measure.
Chord #c...
Information / actions on the selected chord(s) in the current time slot.
Glyphs...
If a glyph is selected, as depicted in the example shown, many glyph-related actions are enabled, depending on the number of selected glyphs.
Boundaries...
This lets you start or stop a series of manual modifications of system boundaries. By dragging system boundaries, you can manually and very precisely adjust the broken lines that define inter-system boundaries. When you are done with boundary modifications, stop the session in order to trigger all needed recomputations.

Score parameters

This dialog lets you display and modify major parameters. It can be accessed through the menu Score | Set parameters... or via the related toolbar icon.

It is organized in several tabs to describe default, score and page scopes:

Default parameters:
(They concern the whole application and persist across application runs)

Language
Define the specification of dominant languages for OCR'ed text (note that you can select several languages)
Binarization
Select the kind of filter (global or adaptive) and adjust the related values.
Tempo
Define the default tempo in number of quarters per minute.
Plugin
Define the default plugin to be launched by the PLUGIN step
Drag n' Drop
Define the default step to be performed when an image file is dropped on the application window
Script
Prompt the user for saving the current script when closing a score
On error
Print out the calling stack whenever any exception is thrown
Parallelism
Define whether machine parallelism should be used.

Score parameters:
(They concern just the current score and apply to all pages in the score, unless otherwise specified at page level)

Parts
This section is specific to the score level and is displayed only when step SCORE has been reached. It defines the name and MIDI instrument for each part

Page parameters:
(They override the default or score parameters for the current page. Pages tabs are present only for multi-page scores)

Here we are telling Audiveris to use an adaptive binarization filter for this particular page.

The two coefficients are adjustable in the adaptive formula:
threshold = meanCoeff * mean + stdDevCoeff * stdDev;
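
The formula can be tried out with a few lines of Python. This is a minimal sketch under explicit assumptions: mean and stdDev are taken to be the mean and (population) standard deviation of the gray levels in a neighborhood of the pixel, a pixel is considered foreground when its level is at or below the threshold (0 being black), and the function names and coefficient values are purely illustrative.

```python
from statistics import mean, pstdev


def adaptive_threshold(neighborhood, mean_coeff, std_dev_coeff):
    """threshold = meanCoeff * mean + stdDevCoeff * stdDev.

    `neighborhood` is a sequence of gray levels (0..255) around the pixel.
    Assumption: population standard deviation over that neighborhood.
    """
    return mean_coeff * mean(neighborhood) + std_dev_coeff * pstdev(neighborhood)


def is_foreground(level, neighborhood, mean_coeff, std_dev_coeff):
    """Assumption: dark pixels (level at or below the threshold) are foreground."""
    return level <= adaptive_threshold(neighborhood, mean_coeff, std_dev_coeff)


if __name__ == "__main__":
    # Uniform gray neighborhood: stdDev is 0, so only the mean term counts.
    window = [100] * 9
    print(adaptive_threshold(window, 0.7, 0.9))
    print(is_foreground(50, window, 0.7, 0.9), is_foreground(90, window, 0.7, 0.9))
```

Because the threshold follows the local mean, a page with uneven lighting can still be separated into foreground and background, which a single global threshold cannot always achieve.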

Errors window

Several steps are able to detect possible errors and sometimes to correct them automatically. Remaining errors are displayed in the error window located at the bottom of the main window.

errors-window

The picture next to this paragraph presents the content of this window after running SCORE on the example image. If the error window is not displayed, make sure to open it through menu Views | Display errors window.

The errors list is sorted, and every error message begins with a context indication.

For example, the third error says: S3P1M*14 [glyph#3174] PAGES Dot unassigned.

  1. The location of this message is coded as S3P1M*14 (System 3, Part 1, Measure 14). The measure number is local to the page, and flagged as such by the '*' character in M*14.
  2. There is also a glyph reference: glyph#3174
  3. The step which has detected the error: PAGES
  4. Finally the message itself: Dot unassigned.
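
The four parts listed above can be extracted mechanically. The following Python sketch parses such a line with a regular expression; the pattern is inferred only from the examples shown in this handbook, so it is an assumption about the format rather than a specification, and the function name is hypothetical.

```python
import re

# Pattern inferred from handbook examples such as
# "S3P1M*14 [glyph#3174] PAGES Dot unassigned".
ERROR_LINE = re.compile(
    r"^S(?P<system>\d+)"
    r"(?:P(?P<part>\d+))?"
    r"(?:M(?P<local>\*)?(?P<measure>\d+))?"
    r"(?:\s+\[glyph#(?P<glyph>\d+)\])?"
    r"\s+(?P<step>[A-Z]+)"
    r"\s+(?P<message>.+)$"
)


def parse_error_line(line):
    """Split an error line into its context fields (None when a field is absent)."""
    match = ERROR_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized error line: " + line)
    fields = match.groupdict()
    for key in ("system", "part", "measure", "glyph"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    # The '*' flag marks a measure number that is local to the page.
    fields["local"] = fields["local"] == "*"
    return fields


if __name__ == "__main__":
    print(parse_error_line("S3P1M*14 [glyph#3174] PAGES Dot unassigned"))
```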

For any error in the list, simply clicking on it will move the current focus to where the program thinks the error is located.

Glyph merge

dot-assigned

Let us take the third error signalled in the error window (Dot unassigned).

A dot-shaped glyph is expected to be assigned a precise shape (augmentation dot, repeat dot, staccato) depending on the surrounding entities. Here none of these assignments was found acceptable. It's a hint that the glyph #3174 is perhaps not a dot.

Let's simply click on this error, and the suspicious glyph gets the global focus, as depicted by the image next to this paragraph.

Obviously, this glyph is not a dot. What is it? It seems to be part of a small half circle, which got cut by the image border. Let's try to fix that, by "merging" the two parts of the half circle.
We select the two parts, for example by using a "lasso" as in:

Once the two glyphs are selected, we can click on the SLUR shape if it appears in the top 5 shapes of the Eval board. If not, we use a right-click to get the context popup menu Glyphs... | Build compound as... | Physicals | SLUR.

The two parts are now merged in a single glyph, which is assigned the SLUR shape.

Glyph assignment

The previous section (using a glyph merge) has already used a glyph assignment (via Eval board or context menu).

Let's recap the various ways to assign a shape:

  • Evaluation board, by clicking on one of the top 5 shapes proposed.
  • Context menu, by navigating to the desired shape via Glyphs... | Assign glyph as... | etc.
  • Shape palette. In the Shape board, open the proper shape set, and use a double-click on the desired shape.
  • Copy / paste a shape from one glyph to another. Use the context menu on the first glyph and select Glyphs... | Copy <SHAPE>, then use the context menu on the other glyph and select Glyphs... | Paste <SHAPE>.

We can add another way, using the Glyph board: clicking on the Deassign button lets you manually deassign a glyph shape.

Glyph split

no-note

Let's click on another signalled error, which says:
S3 [glyph#5076] PAGES Slur with no embraced notes

Here we have a pack of pixels which results from two overlapping objects, a flag and a rest, and the program has been unable to split this big glyph into proper components. Let's do this manually.


sections

We have to work at the section level. To do so, use Views | Enable section selection. The section boundaries are now visible, as you can see on the picture next to this paragraph.


splitting-sections

Using the left mouse button, while pressing down the CTRL key, allows the user to select as many sections as desired.

Since sections are usually small, the selection gesture is a bit simplified when compared to glyph selection: You don't even have to release the mouse button when moving from one section to the other, simply browse the sections as you would do with an eraser, and all the touched sections will be added to the selection in a "greedy" mode. To really remove a given section from the current selection, release the mouse button and press down the button again on the section to remove (always keeping the CTRL key pressed).
Note, as you select the sections of the "eighth rest" portion of the glyph, that the Eval board continuously tries to recognize a shape out of the selected sections.

When you have selected all the sections that compose the eighth rest, the EIGHTH_REST button should appear in the top 5.
Simply click on the related button, and the shape is assigned to a new glyph composed of the selected sections.
[You can also, when your section selection is ready, use a right click to open the context popup menu and use the Glyphs... | Assign glyph as... item.]



Immediately, the remaining part of the former "big" glyph is recognized as a flag. You can switch back to the normal glyph-level selection mode.

Glyph insertion

In the section above, we have been lucky that the separation between the two glyphs could be found by aggregating sections. But these splits on section borders are not always satisfactory. In that case, the solution is to inject the needed glyphs directly into the sheet structure.

cleaned glyph

For the sake of example, let's suppose that the split above had not been performed. So, we still have this big glyph instead of clearly separated flag glyph and eighth rest glyph. So we'll inject them.

First let's get rid of assigned shapes (FLAG_1 and SLUR), by using the Deassign button, until no more shape is assigned.
Note that when a shape is manually deassigned, the program tries to assign another shape, hence the need for perhaps multiple deassignments. We could also directly assign a CLUTTER shape, to avoid these multiple deassignments. The picture presents the "cleaned" glyph.


dragged flag

In the shape palette, we now select the range dedicated to flags, and then drag the suitable flag shape from the palette to the sheet view. In the picture, you can see the ghost image of the flag being dragged.

We pay attention to correctly position the flag along the stem, and we release the mouse button.


Virtual glyphs

We perform a similar action for the eighth rest, and we have reached the final result.

Just a couple of remarks:

  • For certain shapes, like the flag, it is important to position the glyph with a rather good precision, otherwise the distance between the stem and this flag may be larger than the tolerated margin, and the stem and the flag will not be recognized as connected.
  • For the time being, the DnD is rudimentary: once a glyph has been dropped, it cannot be moved. The only workaround is to delete this virtual glyph (by deassigning it) and perform the DnD again.
  • Another current limitation is that there is as yet no way to resize a virtual glyph. So DnD does not really work for such shapes as beams, slurs, crescendos, decrescendos, ... These moving / resizing features are postponed until we integrate a more powerful way to play with glyphs and shape display (certainly the NetBeans Visual Lib).

Text correction

Text-shaped items are processed by the OCR engine to retrieve their actual content.

Make sure to select the proper language to get the best OCR results. You can still change the language afterwards, which automatically re-runs OCR on the detected words, but the best results are achieved when the right language is chosen upfront.

Even with proper language selected, some texts are not correctly OCR'ed.

  • In the Chula example at hand, the number "31" was not detected in the upper right corner. The solution is to select the glyph by a "lasso" and assign the TEXT shape. This triggers the OCR, which should recognize the proper digits.
  • Some words may be assigned a wrong OCR value. To fix this, select the related glyph and in the text field of the Glyph board, directly type the correct word value. You can also modify other attributes of the textual glyph, such as the text role or the text type.

Wiki on whole examples

A Wiki is available online to document the use and evolution of Audiveris software.

Audiveris is installed with a handful of typical input files located in the examples folder of the program installation directory. Almost all pictures used in this handbook are snapshots of these examples. The processing of these simple examples is documented in this Wiki page.

Another source of examples is the online repository of MusicXML examples available on the MakeMusic site. There you can find more than a dozen examples with their PDF input and their XML / MXL outputs. These are good quality PDF files which exhibit a wide set of representative music features.

Actually, one of the objectives of Audiveris 4.2 release was to be able to process these former Recordare examples with as good results as possible, even if some manual processing was still necessary. The detailed processing of each of these examples is documented in the dedicated Wiki page. Browsing this Wiki should give you a more realistic view of concrete Audiveris processing and manual interactions.

Advanced

Arguments

As we have seen in the Java Web Start section, Audiveris can be launched from the command line via the javaws program. However, this program accepts a limited set of arguments. Type javaws alone to get a description of possible arguments.

javaws accepts JVM arguments but no application arguments, at least directly.

A much more versatile way is to go through the jnlp file itself. This is a basic XML file which describes how to retrieve and launch the Java application. It is this file which is hyperlinked by the Launch button on Audiveris web page and which is also referred to by the javaws command.

The URL of this jnlp file is https://audiveris.kenai.com/jnlp/launch.jnlp. You can simply download this file and modify it locally with a text editor. Then use a double-click on the local file or use a javaws command pointing to the local file instead of the remote one.

JVM arguments

Many kinds of arguments can be provided to the Java Virtual Machine.

A typical use is to define the maximum amount of memory available, via the -Xmx option, for example -Xmx512M for 512 megabytes, or -Xmx2G for 2 gigabytes. (Please note that you cannot practically go beyond the physical memory available on your machine, otherwise memory swapping will severely impact the processing speed). Similarly, the -Xms option defines the initial amount of memory.

JVM arguments can be provided via the javaws command line and via the jnlp file:

javaws command

Use the javaws -J option for each JVM argument. For example, to specify initial and maximum memory values, you can use something like:

javaws -J-Xms512m -J-Xmx1024m https://audiveris.kenai.com/jnlp/launch.jnlp

Mind the fact that there is no space between the "-J" prefix and the actual JVM argument.

jnlp file

In the (local) file, look for the <resources> element and the contained <java> element. Then modify the java-vm-args attribute as you wish, for example:

[...]
<resources>
    <java href="..." version="..." java-vm-args="-Xms512m -Xmx1024m" />
    [...]
</resources>
[...]

Application arguments

Application arguments can only be provided through the jnlp file.

In the (local) file, look for the <application-desc> element. Then insert there as many <argument> elements as you wish.

Perhaps the most typical use of such arguments is to run Audiveris in batch. For example, to open the input file myFile.pdf and transcribe it in batch to the output file myFile.xml, you would use something like:

[...]
<application-desc main-class="Audiveris">
    <argument>-batch</argument>
    <argument>-input</argument>
    <argument>path/to/myFile.pdf</argument>
    <argument>-export</argument>
    <argument>path/to/myFile.xml</argument>
</application-desc>
[...]
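The two local edits described above (JVM arguments and application arguments) can also be scripted. The following is only a sketch: the customize_jnlp function and the trimmed-down sample document are hypothetical, and a real launch.jnlp contains many more elements than shown here. Only the java-vm-args attribute and the <argument> elements come from the text above.

```python
import xml.etree.ElementTree as ET

def customize_jnlp(jnlp_text, vm_args, app_args):
    """Return JNLP text with java-vm-args set and <argument> elements appended."""
    root = ET.fromstring(jnlp_text)
    # Set the JVM arguments on every <java> element of the file
    for java in root.iter("java"):
        java.set("java-vm-args", vm_args)
    # Append one <argument> element per application argument
    app_desc = root.find("application-desc")
    for value in app_args:
        ET.SubElement(app_desc, "argument").text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical, heavily trimmed stand-in for a downloaded launch.jnlp
sample = """<jnlp>
  <resources><java version="1.7+"/></resources>
  <application-desc main-class="Audiveris"/>
</jnlp>"""

result = customize_jnlp(
    sample,
    "-Xms512m -Xmx1024m",
    ["-batch", "-input", "path/to/myFile.pdf", "-export", "path/to/myFile.xml"],
)
print(result)
```

The modified text can then be saved to a local file and passed to javaws in place of the remote URL.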

Here is the summary you get when launching Audiveris with the -help argument:

More explanation on Audiveris arguments:

-help
Displays the arguments summary as printed above.
-batch
Launches Audiveris without any Graphic User Interface.
-step (STEPNAME | @STEPLIST)+
Performs all the specified steps (automatically including the steps which are mandatory to get to the specified ones).
'STEPNAME' can be any one of the step names (the case is irrelevant).
These steps will be performed on each sheet referenced from the command line.
-option (KEY=VALUE | @OPTIONLIST)+
Specifies the value of some application parameters (that can also be set via the pull-down menu Tools | Options).
You can state key=value pairs or reference an options file (flagged by an @ sign) that lists key=value pairs (or even other files recursively).
A list file is a simple text file, with one key=value pair per line. Nota: The syntax used is the Properties syntax, so for example back-slashes must be escaped.
-script (SCRIPTNAME | @SCRIPTLIST)+
Specifies some scripts to be read, using the same mechanism as input command below.
These script files contain actions generally recorded during a previous run.
-input (FILENAME | @FILELIST)+
Specifies some image files to be read, either by naming the image file or by referencing (flagged by an @ sign) a file that lists image files (or even other file lists, recursively). A list file is a simple text file, with one image file name per line.
-pages (PAGE | @PAGELIST)+
Specifies some specific pages to be loaded, either by naming the page number (counted from 1) or by referencing (flagged by an @ sign) a file that lists page numbers (or even other page lists, recursively). A page file is a simple text file, with one page number per line.
-bench (DIRNAME | FILENAME)
Defines an output path to bench data file (or directory).
This bench data is meant for application monitoring only.
Nota: If the path refers to an existing directory, each processed score will output its bench data to a score specific file created in the provided directory. Otherwise, all bench data, whatever its related score, will be written to the provided single file.
-print (DIRNAME | FILENAME)
Defines an output path to PDF file (or directory).
Same note as for -bench option.
-export (DIRNAME | FILENAME)
Defines an output path to MusicXML file (or directory).
Same note as for -bench option.
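To illustrate the @ convention shared by several of these arguments, here is a minimal sketch of how a list of values with recursive @ references could be flattened. It is not the Audiveris parser: the expand_args function, the file names and the guard against circular references are all assumptions made for the example.

```python
import os
import tempfile

def expand_args(items, seen=None):
    """Expand '@list' references recursively into a flat list of plain values."""
    seen = set() if seen is None else seen
    result = []
    for item in items:
        if item.startswith("@"):
            path = os.path.abspath(item[1:])
            if path in seen:
                continue  # assumption: a list file already read is skipped
            seen.add(path)
            with open(path, encoding="utf-8") as f:
                lines = [ln.strip() for ln in f if ln.strip()]
            result.extend(expand_args(lines, seen))
        else:
            result.append(item)
    return result

# Demo with two hypothetical list files, the outer one referencing the inner one
with tempfile.TemporaryDirectory() as tmp:
    inner = os.path.join(tmp, "inner.txt")
    outer = os.path.join(tmp, "outer.txt")
    with open(inner, "w", encoding="utf-8") as f:
        f.write("c.png\n")
    with open(outer, "w", encoding="utf-8") as f:
        f.write("b.png\n@" + inner + "\n")
    expanded = expand_args(["a.png", "@" + outer])

print(expanded)
```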

Log

All messages displayed in Audiveris log window are also written into a file called audiveris.log. Such log is a simple text file meant for later analysis, and in particular is very useful when filing a bug report or posting a message on Audiveris forum.

The audiveris.log file is by default located in the temp folder of user data. Path and file name can be changed by setting the system property stdouterr through a JVM option as follows:
-Dstdouterr=path/to/some-file.log

The advanced user can precisely customize the logged information by manually editing the configuration file logging.properties located in the settings folder of user config.

Script

Every user action that can impact the result is recorded in the current score script.

By default, you are prompted to save the script when the score is closed. You can override this behavior via the menu Score | Set parameters... or directly by setting the constant omr.script.ScriptActions.closeConfirmation.

For example, say you load the file Dichterliebe01.pdf, set the default language to German (code is deu) in the score parameters, for the first page decide to use an adaptive binarization and finally launch the EXPORT step. You should get the following script (Dichterliebe01.script.xml):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<script file="D:\soft\audiveris\examples\Dichterliebe01.pdf">
    <parameters>
        <language>deu</language>
        <page index="1">
            <adaptive-filter mean-coeff="0.7" std-dev-coeff="0.9"/>
        </page>
    </parameters>
    <step name="EXPORT"/>
</script>
Example of Dichterliebe01 script

Such scripts can be replayed later. Knowledgeable users can even write scripts from scratch and typically submit them in batch mode.

A script can be launched via the command line, via the menu File | Load script, or via drag and drop from the file explorer.
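As a sketch of what writing a script from scratch might look like, the snippet below builds a reduced version of the Dichterliebe01 example. It uses only element and attribute names that appear in that example; the build_script helper and the input path are hypothetical, and whether Audiveris accepts a script reduced to these elements is an assumption.

```python
import xml.etree.ElementTree as ET

def build_script(input_path, language, final_step):
    """Build a minimal script element mirroring the layout of the example above."""
    script = ET.Element("script", {"file": input_path})
    params = ET.SubElement(script, "parameters")
    ET.SubElement(params, "language").text = language
    ET.SubElement(script, "step", {"name": final_step})
    return script

script = build_script("path/to/Dichterliebe01.pdf", "deu", "EXPORT")
xml_text = ET.tostring(script, encoding="unicode")
print(xml_text)
```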

Plugins

The Plugins menu is based on the content of the plugins folder of Audiveris. Each file, with a .js extension, found in this folder gives birth to a corresponding item in the Plugins menu.

/* ------------------------------------------------ */
/*              m u s e s c o r e . j s             */
/* ------------------------------------------------ */
    
/* Variable to modify according to your environment */
var pathToExec = "P:/MuseScore/bin/mscore.exe";
    
/* Title for menu item */
pluginTitle = 'MuseScore';
    
/* Long description for tool tip */
pluginTip = 'Invoke MuseScore on score XML';
    
/* Build sequence of command line parameters */
function pluginCli(exportFilePath) {
    importPackage(java.util);
    return Arrays.asList([pathToExec, exportFilePath]);
}
Example of MuseScore plugin

Let's take the example of musescore.js plugin:

The purpose of a plugin file is to describe the way an executable must be launched. Note that the plugin does not actually call the executable: the Audiveris Java application does this, based on the information provided by the plugin.

The plugin is implemented as a small piece of JavaScript with just 4 items:

pathToExec
This local variable defines the exact path to the executable as installed on your own environment. Generally, this item is the only one which needs to be customized.
pluginTitle
This variable defines the title to be used by the related menu item.
pluginTip
This variable defines a longer string, to be displayed as the tip related to the menu item.
pluginCli
This is a function provided with the single parameter exportFilePath which contains the path to the MusicXML file exported by Audiveris. The returned value must be the precise sequence of command arguments used when launching the target executable.

Options

This interface, accessible from the menu Tools | Options, lets you interactively display and modify data related to Audiveris classes. This is a low-level yet powerful way to handle nearly all application data.

The display combines a tree of classes on the left side, and a table on the right side, where details of the logical constants from the containing classes are available for display and modification.

options

The picture represents a typical Options view:

  1. We are in the package named util, (actually, its full name is omr.util, but we drop the ubiquitous omr. prefix) and the class named OmrExecutors, in charge of the tasks handling.
  2. This class has a logging level, currently assigned to INFO. This information comes from the settings/logging.properties file, but can be modified on the fly, thanks to this interface, to any legal logging value (such as DEBUG, INFO, etc...).
    NOTA: these logging level modifications are meant to be temporary, and thus not stored on disk. For persistent modification, please edit the logging property file directly.
  3. The class also contains some logical constants, which are application-level parameters, whose precise value is kept separate from the algorithmic code.
    For example, the constant useParallelism is an instance of Constant.Boolean class. If set to true (its default value), it lets Audiveris take advantage of all physical processors available.

To ease the retrieval of pertinent constants, you can use the search field located in the upper part of the Options window. Here, we have just entered the string "parallel" in this field.

How do we define the current value of such a logical constant? The JavaDoc of class omr.constant.ConstantManager explains the mechanism in detail. In short, the overriding sequence is defined as follows, from lower to higher priority:

  1. SOURCE: The default value as defined in the source code,
  2. USER: The value, if any, found in file run.properties,
  3. CLI: The value, if any, specified in the command line by a -option key=value. The CLI value is persisted in the USER file when running in interactive mode, and not persisted when running in batch.
  4. UI: The value, if any, specified through the Tools | Options user interface. These UI values are persisted in the USER file.
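The overriding sequence listed above can be pictured with a small sketch. This is not the ConstantManager code: the resolve function and the sample values are hypothetical, and the persistence of CLI and UI values into the USER file is deliberately left out.

```python
# Sources ordered from lower to higher priority, as listed above
PRIORITY = ["SOURCE", "USER", "CLI", "UI"]

def resolve(values):
    """Return (origin, value) for the highest-priority source that defines a value."""
    for origin in reversed(PRIORITY):
        if values.get(origin) is not None:
            return origin, values[origin]
    raise KeyError("no value defined, not even a SOURCE default")

# Hypothetical values for one constant
print(resolve({"SOURCE": "true"}))
print(resolve({"SOURCE": "true", "USER": "false"}))
print(resolve({"SOURCE": "true", "USER": "false", "CLI": "true"}))
```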

If ever you want to get back to the SOURCE value of a constant, and discard the modifications you have made, simply check the related box in the Log/Modif column.

To restore the SOURCE value for all constants, use the Reset button located at the top of the Options window.

Training

Audiveris has the ability to train the underlying Neural Network evaluator with representative samples.

Note that the program is released with a pre-trained evaluator so the casual user can safely ignore this training section. However, if the score(s) you want to transcribe use some specific music font significantly different from the provided examples, you may consider training the evaluator to better fit your case.

Persistence of manual assignments

You can set a mode in which any manual assignment will be saved as a training sample. This lets you record any isolated sample on the fly and thus make it available for a future training of the shape evaluator.
This data is saved under the /train/samples folder.

This mode is disabled by default. To enable the feature, use the menu item Tools | Persist Manuals. From that point on, any manually assigned glyph will be saved, until the program is exited or the mode is manually reset.

Saving all score samples

Another possibility, when you are really confident with all the glyphs recognized in the current sheet, is to save as a whole the complete set of glyphs as training samples.
This data is saved under the /train/sheets folder.

NOTA: This possibility is very powerful but requires a careful manual inspection beforehand.

Sample verifier

The purpose of this "Sample Verifier" is to provide a user interface to visually review some or all of the various glyphs which are used for training the evaluator.

The main objective is thus to easily identify samples which have been assigned a wrong shape. Using them as part of the training base could severely impact the recognition efficiency of the evaluator. So the strategy, when such a wrong glyph has been identified, is simply to delete the glyph from the base.

Sample verifier

Here is an example of the Sample verifier. The top panels are dedicated to selectors, in that order:

Folders
This selector lets you choose one or several folders in which to search for training material:
  • /train/symbols for artificial symbols,
  • /train/samples for isolated samples,
  • /train/sheets for whole sheets.
Shapes
This selector displays only the shapes contained in the folders selected. Select your shapes of interest.
Glyphs
This selector displays only the glyphs corresponding to the shapes selected (within the folders selected). Select your glyphs of interest.

The large panel, on the lower right side, is dedicated to the display of the selected glyphs, using their own coordinates. Notice that glyphs that belong to separate sheets can happen to have close coordinates and thus be displayed as overlapping glyphs.

The lower left panel is composed of two main parts:

  1. The Sample navigator drives the loading and display of glyphs from the selection that comes out of the selectors. You can browse through the loaded glyphs.
  2. The Glyph panel, similar to the Glyph board that appears on the main Audiveris windows, is used to display information about the glyph at hand, together with the evaluations performed by the Neural Network evaluator. The Remove button can be used to discard a wrong glyph: this is implemented through the mere deletion of the underlying glyph XML file.

By default, this user interface looks for glyph files under the /train directory. However, from the Trainer user interface, you can ask to Verify glyphs. In that case, the Sample verifier interface is automatically loaded with the glyphs that failed the validation.

Trainer

This interface is dedicated to the training of the Neural Network evaluator.

Trainer
Selection

This panel is in charge of selecting and loading the glyph XML files, as stored from the predefined symbols and from previous sheet recognitions.

Use the Select Core button when you want to identify a representative Core part within the whole glyph base. The /train/core directory will be emptied and repopulated by the core selected glyph files.

Neural Training

Here we launch and monitor the training of the neural network.
On the left, radio buttons let you select either the Whole base (default option) or the Core base.

The main decision is to choose between a Re-Train, which consists in retraining from scratch, or only an Inc-Train, which works incrementally on top of the previous training sessions.

For the advanced user, several convergence parameters can be adjusted (although they should be kept close to their default values): the Momentum and Learning Rate.

The training ends when any of the thresholds Max Error (residual error) or Epochs (number of iterations) is reached, or when the Stop button is manually pressed. The trainer continuously stores on disk the snapshot of the latest best configuration. This is the default behavior, but you can also force the trainer, via the Use Last button, to select only the last configuration.
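The stopping rule and the choice between the best and the last configuration can be sketched as follows. This is a simplified illustration, not the trainer itself: the train function is hypothetical, the list of errors stands in for the residual error measured after each epoch, and the values are made up.

```python
def train(errors, max_error, max_epochs, use_last=False):
    """Walk through per-epoch errors and return the epoch whose snapshot is kept."""
    best_epoch, best_error = None, float("inf")
    last_epoch = None
    for epoch, error in enumerate(errors, start=1):
        last_epoch = epoch
        if error < best_error:
            # Remember the best configuration seen so far
            best_epoch, best_error = epoch, error
        if error <= max_error or epoch >= max_epochs:
            break  # one of the two thresholds has been reached
    return last_epoch if use_last else best_epoch

history = [0.9, 0.4, 0.2, 0.25, 0.3]  # made-up error values
print(train(history, max_error=0.01, max_epochs=5))
print(train(history, max_error=0.01, max_epochs=5, use_last=True))
```

In this made-up run the error starts rising again after the third epoch, which is why keeping the best snapshot and keeping the last one give different results.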

Validation

At any time, even while the neural network is being trained, you can test the evaluator against the selected population: either the Whole base or the Core base, according to the selection made via the radio buttons. Note that we can train and validate on different bases.

The samples which are either not recognized or (worse) mistaken for another shape are pointed out. The corresponding Verify buttons launch the Sample verifier on the questionable samples to allow a visual check and perhaps the removal of some of them.

Regression Training

This computes the parameters of a linear evaluator, which is used less and less. You can safely ignore this.

Internals

Audiveris is not just a music scanning program. It is also a tool meant to ease the analysis and the development of OMR techniques.

To this end, Audiveris is released with an open source license, and this chapter details the purpose and the outputs of each formalized program step.

Load step

This step is usually implicit. Loading an input file, regardless of how this file is selected, is considered as performing the LOAD step. At the end of this step, each page of the input file is displayed as a separate tab in the main window.

At any time, by manually selecting the menu Steps | Load, you can force the program to reload the current page and reach again the same final mandatory step.

Scale step

Before any digital processing can take place, each pixel must be extracted from the image and flagged as either foreground or background:

  • The default binarization filter uses a global approach, based on a single threshold value for all pixels of the image. All pixels with a gray level which is less than or equal to the threshold are considered as foreground, the others as background. This approach assumes that the image illumination is rather uniform.
  • For images with non-uniform illumination, you can select an adaptive filter, which determines the threshold value for each pixel individually, by using mean value and standard deviation in a small window around the pixel at hand.
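The difference between the two filters can be shown on a tiny synthetic image. The sketch below is not the Audiveris implementation. The coefficients 0.7 and 0.9 are taken from the script example earlier in this handbook, but the way they are combined (a weighted sum of the window mean and standard deviation), the window radius and the image itself are assumptions.

```python
from statistics import mean, pstdev

def global_filter(image, threshold):
    """Foreground (True) where the gray level is <= one threshold shared by all pixels."""
    return [[level <= threshold for level in row] for row in image]

def adaptive_filter(image, radius, mean_coeff, std_dev_coeff):
    """Foreground where the gray level is <= a threshold computed for each pixel
    from the mean and standard deviation of a small window around it."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            threshold = mean_coeff * mean(window) + std_dev_coeff * pstdev(window)
            row.append(image[y][x] <= threshold)
        result.append(row)
    return result

# Synthetic image with a horizontal illumination gradient:
# a dark line (middle row) between two background rows.
background = [100 + 10 * x for x in range(8)]
line = [b - 60 for b in background]
image = [background, line, background]

expected = [[False] * 8, [True] * 8, [False] * 8]
print(adaptive_filter(image, 1, 0.7, 0.9) == expected)
print(any(global_filter(image, t) == expected for t in range(256)))
```

In this image the brightest line pixel (110) is lighter than the darkest background pixel (100), so no single global threshold can separate the line from the background, whereas the per-pixel threshold can.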

Example of a non uniform input image, which exhibits a horizontal illumination gradient:

Nota: The internal runs tables are not displayed by default. To get the related displays you have to set the constant omr.grid.GridBuilder.showRuns to true.

Binarization with the global filter: Too much pepper on the left, too much salt on the right!

Binarization with the adaptive filter: The whole staff is now readable.

Now that it can pick up just the foreground pixels, the program can aggregate them into vertical runs. And simply by analyzing the histograms of run lengths for foreground pixels (and for background pixels as well), it retrieves key information about the music sheet.

Via File | Display Scale Plots you can display both histograms.

In this example, the foreground histogram indicates a peak at value 3 which corresponds to the mean staff line thickness.

If there is no peak above the 10% quorum, the image is not likely to contain music staves.

The sharpness of this peak is also a good indication of the scan quality and the width at 15% is used to define line margins.

The second foreground peak, at value 12, is the second most frequent height and thus corresponds to the average beam height.

Similarly, the main background peak at 18 relates to the average background distance from one staff line to the other.

Adding foreground peak value 3 and background peak value 18 leads to 21, which is now considered as the main interline value and thus the key scaling factor for the sheet at hand.
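The arithmetic above can be reproduced on a synthetic binary image. This sketch only illustrates the principle of vertical run-length histograms: the function names are hypothetical, the quorum and margin checks are left out, and the image is an idealized staff with lines 3 pixels thick separated by 18 background pixels.

```python
from collections import Counter

def vertical_run_histograms(binary):
    """Count vertical run lengths, separately for foreground and background.

    `binary` is a list of rows; True means foreground.
    """
    fore, back = Counter(), Counter()
    height, width = len(binary), len(binary[0])
    for x in range(width):
        start = 0
        for y in range(1, height + 1):
            # A run ends at the bottom of the column or when the pixel value changes
            if y == height or binary[y][x] != binary[start][x]:
                (fore if binary[start][x] else back)[y - start] += 1
                start = y
    return fore, back

def peak(histogram):
    """Most frequent run length."""
    return max(histogram, key=histogram.get)

# Idealized staff: 5 lines of thickness 3, separated by 18 background pixels
pattern = ([False] * 18 + [True] * 3) * 5 + [False] * 18
binary = [[value] * 4 for value in pattern]

fore, back = vertical_run_histograms(binary)
interline = peak(fore) + peak(back)
print(peak(fore), peak(back), interline)
```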

Grid step

Staff lines implemented as natural splines

By aggregating long horizontal sections into filaments, and gathering them into clusters of vertically spaced filaments, Audiveris retrieves staff skeletons.

Similarly, long vertical sections are aggregated into barline filaments.

The crossing of these horizontal and vertical filaments, sometimes fairly wavy, represents a kind of "grid". Because such filaments are often far from being straight lines, they are implemented as natural splines (sequences of Bézier curves with continuity up to the second derivative).

From that point on, this grid is taken as the geometric referential for all other entities. Note that this intrinsic referential allows Audiveris to directly cope with skewed page and / or wavy lines with no deskewing or other processing.

If you wish, you can still ask Audiveris to produce a "dewarped" image, by using the referential as the dewarping grid. Doing so, Audiveris GRID step can also be used as a standalone image dewarping preprocessor.
To compute and display the dewarped image, simply set the constant omr.grid.GridBuilder.buildDewarpedTarget to true, and to save the dewarped image to disk set the constant omr.grid.TargetBuilder.storeDewarp to true.

See below the differences between the initial (warped) image and the final (dewarped) image. Note that the pixel colors are not modified, only their coordinates are.

Initial warped image
Final dewarped image

Systems step

Manual edition of system boundaries

The SYSTEMS step handles the separation between systems, and the dispatching of all sections and glyphs to their "containing" system.

From that point on, most processing will be done at system level. This limits the number of entities (sections & glyphs) to search, and makes it possible to process all systems in parallel, thus taking advantage of the computer hardware architecture.

Audiveris tries to define a "smart" border between adjacent systems which assigns the glyphs to the system they logically belong to.

You can still manually modify the border, by starting a boundary edition session via the (right-click) context menu Boundaries... | Start edition. The broken lines of all boundaries are highlighted in red color. With the mouse, you can adjust the border by simply dragging the lines and points. An intermediate point is automatically removed when it gets aligned with the previous and the next point. When you are done, end the session by Boundaries... | Complete edition so that modifications get immediately taken into account.
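The automatic removal of an aligned intermediate point can be sketched with a simple cross-product test. This is an illustration only: the function names, the tolerance parameter (expressed in cross-product units) and the sample border are hypothetical, and the actual alignment criterion used by Audiveris may differ.

```python
def is_aligned(prev, point, nxt, tolerance=0):
    """True when `point` lies on the straight line through `prev` and `nxt`."""
    cross = ((point[0] - prev[0]) * (nxt[1] - prev[1])
             - (point[1] - prev[1]) * (nxt[0] - prev[0]))
    return abs(cross) <= tolerance

def simplify(points, tolerance=0):
    """Drop every intermediate point aligned with its kept predecessor and its successor."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for point, nxt in zip(points[1:], points[2:]):
        if not is_aligned(kept[-1], point, nxt, tolerance):
            kept.append(point)
    kept.append(points[-1])
    return kept

# Hypothetical border as a broken line; (50, 100) sits on the first segment
border = [(0, 100), (50, 100), (120, 100), (200, 130), (300, 130)]
print(simplify(border))
```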

Measures step

Measures defined by barlines

The MEASURES step uses the barline candidates to build and check the measures of every system. Global measure consistency is further checked for systems that contain more than a single staff.

Texts step

System filtered image passed to OCR

The TEXTS step works on each system in parallel.

It first builds an image of the system area, hiding all glyphs which are too wide or too high, or which intersect a staff interior.

It then hands this filtered system image over to the OCR engine, which performs a layout analysis of the image and the transcription of the detected text blocks.

Sticks step

Ledgers and stems

The STICKS step searches systems for sticks, either horizontal or vertical.

  • Horizontal sticks are further checked to be assigned the LEDGER shape.
  • Vertical sticks can give birth to STEM entities.

Symbols step

Noteheads in orange and beams in cyan

The SYMBOLS step aggregates unassigned sections that connect either horizontally or vertically into glyphs.
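The aggregation of connected sections can be pictured as a connected-components problem. The sketch below is an assumption about the principle, not the Audiveris code: sections are modeled as plain sets of (x, y) pixels, and the touches and aggregate functions are hypothetical names.

```python
def touches(a, b):
    """True when some pixel of section `a` is a horizontal or vertical neighbor of a pixel of `b`."""
    return any((x + dx, y + dy) in b
               for (x, y) in a
               for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))

def aggregate(sections):
    """Group connected sections; each group plays the role of one glyph."""
    parent = list(range(len(sections)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(sections)):
        for j in range(i + 1, len(sections)):
            if touches(sections[i], sections[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(sections)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

sections = [
    {(0, 0), (0, 1), (0, 2)},  # vertical section
    {(1, 1), (2, 1)},          # touches the first one horizontally
    {(5, 0), (5, 1)},          # isolated
    {(3, 2)},                  # only diagonal to the second one, so not connected
]
print(aggregate(sections))
```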

For each glyph, Audiveris computes a series of key parameters based on ART moments (as used by MPEG-7) and feeds the Neural Network evaluator to look for a suitable shape.

Several dedicated patterns are run at system level to further check and correct the glyph assignments.

The SYMBOLS step iterates at system level on the cycle: aggregation / assignment / patterns.

The final glyphs are displayed with a color that is specific to their assigned shape.

Pages step

Logical entities

The PAGES step works at page level to translate all assigned glyphs to their corresponding score logical entities.

Global consistency checks are run at page level, to adjust parameters such as the time signatures.

Score step

One color for each voice

The SCORE step connects the various pages of a multi-page score (this step is almost void for a single-page score):

  • Connection of system parts (including voices and instruments) across pages
  • Global measure numbering
  • Use of time signatures to check measure durations
  • Connection of orphaned slurs across pages

Using menu Views | Show score voices, or the related F11 key, you can visualize the voices with different colors as shown on the presented picture.

Nota:

  • For the time being, the SCORE step works on the pages loaded as parts of the multi-page score in memory.
  • For large scores that won't fit in memory, a different approach is needed, whereby pages will be separately recognized and saved as MusicXML files. An offline final reduction task will then run to connect these per-page XML files. Further work is still to be done, but a prototype is already available in the omr.score.ScoreXmlReduction class.

Development

This section is meant for developers rather than "plain" users of Audiveris.

Building from Source

Use of NetBeans

This is by far the easiest way, whatever your operating system, to download all the source pieces from Kenai Mercurial (Hg) repository and build Audiveris from them.

Assuming you have NetBeans installed, from the IDE top menu, select Team | Mercurial | Clone Other... which will take you through 3 steps:

  1. In the Mercurial Repository step, set the Repository URL to the Kenai address https://hg.kenai.com/hg/audiveris~hg
  2. In the Mercurial Paths step, you can keep the proposed paths
  3. In the Destination Directory step, select the Parent Directory and the Clone Name of your choice, for example "audiveris~hg"

This will clone the remote Kenai repository to the chosen local "audiveris~hg" directory in a couple of minutes and then open the project automatically. Actually, you'll be notified that 2 projects were cloned. This is so because, inside the main project (audiveris), there is a subproject (installer) to handle the installation of Audiveris bundle. Audiveris project depends on Installer project. You can open one or both projects.

Audiveris project

NetBeans signals a minor problem: The Java class omr.WellKnowns depends on a ProgramId class which cannot be found. This ProgramId class provides Hg-based information and, by design, is always dynamically generated before any compile is performed, so this initial problem can safely be ignored and will disappear at first build.

Audiveris application can be built and run in different modes. To switch modes, open the project properties and select the "Web Start" category dialog:

Stand-alone
Enable Web Start checkbox: unchecked. Codebase selection: N/A.
This mode lets you test the Audiveris application locally, with no Installer, and with no need to sign the resources.
Nota: It requires that the needed bridging libraries be found in a system location (since they are not provided as Java Web Start resources).
For Windows: jniTessBridge.dll in c:\windows\system32
For Ubuntu: libjniTessBridge.so in /usr/lib/jni
Local Java Web Start
Enable Web Start checkbox: checked. Codebase selection: "Local execution".
This mode lets you test the Java Web Start mechanisms with all resources kept locally.
Remote Java Web Start
Enable Web Start checkbox: checked. Codebase selection: "User defined".
This mode uses Java Web Start with all resources downloaded from a remote location.
Define Codebase Preview as the remote URL, typically https://audiveris.kenai.com/jnlp.
Use a WebDAV client, such as BitKinex, to upload the content of the local dist folder to the remote web location before any actual use.

When using Java Web Start (either local or remote) all the resources listed in the JNLP file must be signed, because of the use of native libraries.

This signing is handled automatically by the build mechanism, based on the information provided in the Signing dialog box through the Customize button.

Here we use a local keystore ("AudiverisLtd.keystore") located in project parent folder and where the alias "signFiles" has been defined using the keytool utility. This alias is then used by the build task to sign the resources. Note that all resources must be signed by the same certificate.

Installer project

One works on the Installer project only to modify the installer features, otherwise it gets automatically built from Audiveris project which depends on it.

In fact, the Installer does not know it is part of a Java Web Start installation. It is a plain Java program called by javaws as an extension of Audiveris installation, with a single argument: "install" for installation and "uninstall" for uninstallation.

It can be modified, tested and debugged in a stand-alone manner, with proper argument ("install" or "uninstall"). To ease compilation and test without a real javaws underneath, the javax.jnlp package, which provides the JNLP services, is simply described through a facade.

A final piece of advice: since the Installer has to write in protected locations, debugging is more convenient when the whole NetBeans session has been launched at administrator level.

Customization

The build process uses global variables gathered in dev/build.default.properties file.

To adapt the build process to your environment, you can override some of these variables in an optional build.properties file. This file can be located in the dev directory or in the user directory. For Windows, the user directory is %APPDATA%/AudiverisLtd/audiveris.

Organization
For your information, the general organization of the target Audiveris folders is detailed in the Folders section.

Linux specials

Please note that this section is obsolete, since users are now encouraged to use the single Java Web Start approach available for Windows and for Ubuntu. It can still be useful for other Linux distributions or for people who prefer building Audiveris on their own.

Because it is impossible to support the sheer number of Linux distributions and the different binary formats they use, we provide fully automated installation packages only for Ubuntu.

For all other distributions we provide generic application packages coupled with detailed installation instructions. On non-Ubuntu Linux you need to install various required components manually. For further information, please refer to the Manual installation section.

Full installation on Ubuntu 12.04 and above

Note: Our Ubuntu packages won't work on versions prior to 12.04 or architectures other than Intel i386/amd64.

The Ubuntu 12.04 packages are available on the download area of Audiveris project and match the audiveris-V.v.r-ubuntu-arch.deb naming scheme, where V.v.r denotes Audiveris version+revision and arch the hardware architecture (i386 for Intel 32-bit or amd64 for 64-bit).
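As an illustration of this naming, the sketch below builds the expected file name for the current machine. The function names deb_name and deb_arch are made up for this example, and the version 4.3.0 in the usage comment is a placeholder, not a real release number.

```shell
# Build the package file name from version+revision and architecture.
deb_name() {
    printf 'audiveris-%s-ubuntu-%s.deb\n' "$1" "$2"
}

# Map the machine name reported by "uname -m" to the tag used in the file name.
deb_arch() {
    case "$1" in
        x86_64)              echo amd64 ;;
        i386|i486|i586|i686) echo i386 ;;
        *)                   return 1 ;;   # not covered by the Ubuntu packages
    esac
}

# Example (placeholder version):
#   sudo dpkg -i "$(deb_name 4.3.0 "$(deb_arch "$(uname -m)")")"
```

On Ubuntu, the command dpkg --print-architecture reports amd64 or i386 directly and can replace the mapping function.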

Generic Audiveris package

Note: This generic Audiveris application package is intended to install the Audiveris application, music font and desktop integration on any Linux distribution. However, it requires that the Tesseract shared libraries be already present, otherwise the installation will fail. So, refer to the Installing Tesseract OCR section to install Tesseract beforehand.

Generic RPM packages suitable for installation on OpenSUSE, Fedora, Mandriva and others are available on the download area of Audiveris project. The packages match the audiveris-V.v.r-generic-arch.rpm naming scheme, where V.v.r denotes Audiveris version+revision and arch the hardware architecture (i386 for Intel 32-bit or x86_64 for 64-bit).

Other targets than Ubuntu, OpenSUSE or Fedora

The basic steps will be the same as given in the Manual installation section, though the real commands will vary heavily depending on your distribution and hardware architecture. You can post requests on Audiveris forum if you need additional instructions.

Manual Installation

This section provides a step-by-step guide on how to install Audiveris on different, mainly non-Ubuntu, Linux distributions, which don't provide pre-built binary libraries for Tesseract 3.02. You are assumed to be familiar with the process of compiling and installing software from source, and with basic command line operation.

Basically, the installation of Audiveris on a particular Linux distribution consists of the following steps (most of them require the user to type commands in a terminal):

  • (compiling and) installing Google's Tesseract optical character recognition engine
  • installing language files for Tesseract
  • installing Java Runtime Environment
  • installing Ghostscript - an open-source tool for manipulating PDF documents

Installing Tesseract OCR

At the time of this writing, the Tesseract project doesn't officially provide binary packages for Linux. Therefore, unless your distribution offers them (see OpenSUSE below), you need to compile and install the Tesseract libraries manually.

Installing Tesseract on OpenSUSE 12.2

OpenSUSE 12.2 offers unofficial binary packages for Tesseract's shared libraries. This simplifies the installation greatly. Please proceed as follows:

  1. In your browser, go to the Leptonica download page, select OpenSUSE 12.2, then Show unstable packages. Click on the proper package from Lazy_Kent according to your hardware architecture (32 or 64 bit) and follow the instructions to install Leptonica.
  2. In your browser, go to the Tesseract download page, select OpenSUSE 12.2, then Show unstable packages. Click on the proper package from Lazy_Kent (version 3.02.02!) according to your hardware architecture (32 or 64 bit) and follow the instructions to install Tesseract's shared libraries.

Installing Tesseract on Fedora 17

Due to the lack of any binary packages for Tesseract in the Fedora distribution, you need to compile this software manually.

Installing development tools

In order to compile software from sources, you need to install development tools first, because they are likely not present on your system by default. Please run the following command in your terminal:

sudo yum groupinstall "Development Tools"
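Once the group is installed, a quick check such as the following can confirm that the tools used later in this section are reachable. This is only a sketch: the function name missing_tools is made up, and the list of tools is an assumption about what the Leptonica and Tesseract builds call.

```shell
# Print the names of the given tools that are not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed set of tools needed by the configure/autogen/make steps.
missing_tools gcc g++ make autoconf automake libtool
```

No output means that every listed tool has been found.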

Installing Tesseract dependencies

Tesseract OCR depends on the image processing library called Leptonica, which, in turn, handles several image formats such as JPEG, PNG, TIFF and GIF through external libraries. You need to install the development version of each of those libraries:

sudo yum install zlib-devel libpng-devel libjpeg-devel libtiff-devel giflib-devel

Now download Leptonica's source code and unpack it:

sudo yum install wget
wget http://www.leptonica.com/source/leptonica-1.69.tar.gz
gunzip leptonica-1.69.tar.gz
tar -xvf leptonica-1.69.tar

Alternatively, you can download the source using your browser here: http://www.leptonica.com/download.html

Let us configure, build and install Leptonica now:

cd [path to leptonica-1.69 folder]
./configure --disable-programs
make
sudo make install

This will install Leptonica libraries to /usr/local/lib.
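To confirm that the installation step succeeded, you can look for the library files. A minimal sketch, with a made-up function name has_leptonica and the assumption that the installed files start with liblept:

```shell
# Succeed if at least one file named liblept* exists in the given folder.
has_leptonica() {
    for f in "$1"/liblept*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

if has_leptonica /usr/local/lib; then
    echo "Leptonica found in /usr/local/lib"
else
    echo "Leptonica not found in /usr/local/lib"
fi
```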

Compiling and installing Tesseract
  • Download Tesseract source using your browser or wget from here: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
  • Unpack it to an easily accessible directory, for example /home/user/Documents/. You should see tesseract-ocr folder in your working directory.
  • Start up the terminal and change to that source directory:
    cd /home/user/Documents/tesseract-ocr
  • Now run the following commands in order to compile and install Tesseract to /usr/local/lib:
    ./autogen.sh
    ./configure
    make
    sudo make install

The following instructions in this section are only valid for the Fedora 17 distribution!

Fedora, as a Red-Hat-derived distribution, doesn't include /usr/local/lib in its library search path. Therefore, it's necessary to add that path so that the Tesseract shared libraries can be found by the Audiveris application. Please proceed as follows:

su - -c 'gedit /etc/ld.so.conf'

Append the following path to the end of the file if it's not already there:

/usr/local/lib

Save the file and reconfigure the dynamic linker by issuing the following command:

sudo ldconfig
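If you prefer not to open an editor, the same change can be scripted. The sketch below appends the path only when it is not already present; the function name append_once is made up for this example.

```shell
# Append a line to a file unless an identical line is already there.
append_once() {
    line=$1
    file=$2
    # -x matches whole lines only, -F takes the line literally (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# To be run from a root shell on the real system:
#   append_once /usr/local/lib /etc/ld.so.conf && ldconfig
```

Calling the function twice leaves a single entry in the file, so the script can be re-run safely.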
Installing language files for Tesseract

Tesseract requires several language data files to be present on your machine. The following steps install data for the English, German, French and Italian languages:

  • Create the working directory named tesseract-ocr somewhere, for example: /home/user/Documents/tesseract-ocr. If you have compiled Tesseract from sources, this will be the source code directory.
  • Download required language files from here: http://code.google.com/p/tesseract-ocr/downloads/. These files follow the naming schema tesseract-ocr-3.02.XXX.tar.gz where XXX denotes language code, i.e. eng = English, deu = German, ita = Italian and fra = French.
  • Place the downloaded language files into the parent directory of tesseract-ocr, i.e. /home/user/Documents/.
  • Double-click them to unpack. The data will be automatically extracted into the tessdata directory inside the tesseract-ocr directory.
  • Start up the terminal and run the following command:
    sudo cp -r path_to_tessdata_folder destination
    where path_to_tessdata_folder denotes /home/user/Documents/tesseract-ocr/tessdata in this example, and destination is either /usr/local/share (when Tesseract was compiled from source) or /usr/share (when it was installed using the OpenSUSE binary packages).
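The steps above can also be done entirely from the terminal. The following sketch assumes the archives have already been downloaded into the current folder and that each one extracts into tesseract-ocr/tessdata, as described above. The function names lang_archive and unpack_langs are made up.

```shell
# File name of the language archive for a given language code.
lang_archive() {
    printf 'tesseract-ocr-3.02.%s.tar.gz\n' "$1"
}

# Unpack the archives of the given language codes found in the current folder.
unpack_langs() {
    for code in "$@"; do
        tar -xzf "$(lang_archive "$code")" || return 1
    done
}

# Example, run from /home/user/Documents:
#   unpack_langs eng deu fra ita
#   sudo cp -r tesseract-ocr/tessdata /usr/local/share
```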

You need to set the TESSDATA_PREFIX environment variable to point to the parent directory of tessdata, in order to tell Audiveris where the language data are located.

To edit the global profile, run the following command in the terminal:

  • su - -c 'gedit /etc/profile.local' on OpenSUSE,
  • su - -c 'gedit /etc/profile.d/local.sh' on Fedora.

Append the following line:

  • export TESSDATA_PREFIX=/usr/local/share on Fedora 17,
  • export TESSDATA_PREFIX=/usr/share on OpenSUSE.

Save the file, then log out and log in again in order to activate the changes.
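After logging in again, a short check can confirm that the variable is set and that the language files are visible through it. This sketch assumes the Tesseract 3.02 convention of one CODE.traineddata file per language; the function name has_lang is made up.

```shell
# Succeed if the given prefix contains tessdata/<code>.traineddata.
has_lang() {
    [ -f "$1/tessdata/$2.traineddata" ]
}

# Report which of the four languages are visible through TESSDATA_PREFIX.
for lang in eng deu fra ita; do
    if has_lang "${TESSDATA_PREFIX:-}" "$lang"; then
        echo "$lang: found"
    else
        echo "$lang: missing"
    fi
done
```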

Installing Java Runtime Environment (JRE)

Audiveris requires Java 7. Please install it as follows:

  • sudo zypper install java-1_7_0-openjdk on OpenSUSE,
  • sudo yum install java-1.7.0-openjdk on Fedora.

To test your JRE installation, run the following command:

java -version

You should see something similar to:

java version "1.7.0_09-icedtea"
OpenJDK Runtime Environment (fedora-2.3.3.fc17-x86_64)
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)
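The check can be automated by extracting the major version from that output. A minimal sketch, with a made-up function name java_major. Note that java -version writes to standard error, so the output must be redirected before it can be captured.

```shell
# Extract the Java major version from the output of "java -version".
# "1.7.0_09-icedtea" gives 7; later numbering such as "11.0.2" gives 11.
java_major() {
    ver=$(printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) ver=${ver#1.} ;;          # old scheme: drop the leading "1."
    esac
    printf '%s\n' "${ver%%[._-]*}"     # keep what precedes the first separator
}

# Example:
#   [ "$(java_major "$(java -version 2>&1)")" -ge 7 ] && echo "Java is recent enough"
```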
Installing Ghostscript

Audiveris requires Ghostscript (an interpreter for the PostScript language and for PDF) in order to deal with PDF files. Ghostscript is already installed by default on both OpenSUSE and Fedora. Audiveris requires Ghostscript 9.05 or higher. To verify that, run the following command in the terminal:

gs --version
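To compare the reported version against the 9.05 minimum in a script, something like the following can be used. It is a sketch that relies on the -V (version sort) option of GNU sort; the function name version_ge is made up.

```shell
# Succeed if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v gs >/dev/null 2>&1; then
    if version_ge "$(gs --version)" 9.05; then
        echo "Ghostscript version is suitable"
    else
        echo "Ghostscript is older than 9.05"
    fi
else
    echo "Ghostscript is not installed"
fi
```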

Folders

This article describes the organization of folders on the target machine, depending on the operating system.

Some folders are annotated with a CONSTANT_NAME which represents a direct link from Java code.

System

Libraries that can be shared at system level. This applies to the Tesseract libraries. These shareable libraries are installed by the Audiveris bundle installer.

OS        Root
Windows   C:/Windows/System32/
Ubuntu    /usr/lib/jni
Generic   ???

Folder        Description        Content
. (Windows)   Shared libraries   liblept168.dll, libtesseract302.dll
. (Ubuntu)    Shared libraries   libjniTessBridge.so

Companions

This concerns only the two native companion programs used by Audiveris application, namely Ghostscript and Tesseract. (Other companions are used but they are Java libraries, and are simply located next to Audiveris Java archive).

Ghostscript

A Ghostscript sub-process is used to convert on-the-fly a PDF input file into a temporary TIFF file which is then read by Audiveris.
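To give an idea of what such a conversion looks like, here is a sketch of a Ghostscript call that turns a PDF into a TIFF file. The exact options used by Audiveris are not documented here: the tiffgray device, the 300 dpi resolution, the function name pdf_to_tiff and the GS override are assumptions chosen for illustration.

```shell
# Convert a PDF file ($1) into a TIFF file ($2) through Ghostscript.
# Device and resolution are illustrative choices, not Audiveris settings.
pdf_to_tiff() {
    "${GS:-gs}" -q -dNOPAUSE -dBATCH -sDEVICE=tiffgray -r300 \
        -sOutputFile="$2" "$1"
}

# Example:
#   pdf_to_tiff Dichterliebe01.pdf /tmp/Dichterliebe01.tif
```

The -dNOPAUSE and -dBATCH options make Ghostscript process all pages and exit without waiting for user input, which is what a sub-process needs.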

A suitable Ghostscript version is checked and installed as needed by Audiveris bundle installer.

Tesseract

Tesseract is used as the OCR engine, called repeatedly by Audiveris via a JNI interface.

  1. Because of the size of their related files, just a few languages are provided by default through Audiveris installation.
    • On Windows, they are located by default in the folder for 32-bit applications: c:\Program Files (x86)\tesseract-ocr\.
      However, if Tesseract application itself has been installed from Tesseract web site, the TESSDATA_PREFIX environment variable points to the target folder. But note that Tesseract application is not needed by Audiveris, which uses a DLL library.
    • On Linux, they are located in the shareable tesseract folder /usr/share/tesseract-ocr/
  2. Additional languages can be downloaded and installed through the Audiveris bundle installer.

Application

Audiveris application program and (read-only) data are no longer directly visible. They are handled by the global Java cache, and "visible" only through the cache viewer.

For your information, this concerns the Audiveris main jar file, plus all the needed Java libraries as well as the needed resources, including the OS- and ARCH-dependent JNI bridges ("jniTessBridge.dll" or "libjniTessBridge.so") to shareable native libraries.

Specific notes for Ubuntu:

  • Folder /usr/bin contains an "audiveris" executable file, a shell script to launch Audiveris application.
  • Folder /usr/share/doc/audiveris contains a "copyright" file that describes licenses and copyrights about Audiveris companions.
  • Folder /usr/share/icons/audiveris contains audiveris icon as an "audiveris.png" file.

GUI persistency

Persistency of the graphical user interface across application runs.

OS        Root
Windows   %APPDATA%/AudiverisLtd/audiveris/
Ubuntu    ~/.audiveris/
Generic   ???

Folder   Description                  Content
.        User interface persistency   One file per frame:
                                      aboutDialog.session.xml
                                      mainFrame.session.xml
                                      optionsFrame.session.xml
                                      SampleVerifierFrame.session.xml
                                      scoreParams.session.xml
                                      trainerFrame.session.xml
                                      etc.

User config

User-specific read-write configuration data (CONFIG_FOLDER), installed by Audiveris bundle installer.

OS        Root
Windows   %APPDATA%/AudiverisLtd/audiveris/config/
Ubuntu    ~/.config/AudiverisLtd/audiveris/
Generic   $XDG_CONFIG_HOME/AudiverisLtd/audiveris/

Folder                       Description           Content
.                            Configuration files   run.properties, user-actions.xml
./plugins (PLUGINS_FOLDER)   JavaScript plugins    finale-notepad.js, finale.js, musescore.js

User data

User-specific read-write data (DATA_FOLDER), installed by Audiveris bundle installer.

This also includes the folders examples and www which, strictly speaking, should be read-only application data but need to be browsable and thus could not be left packaged in the Java cache.

OS        Root
Windows   %APPDATA%/AudiverisLtd/audiveris/data/
Ubuntu    ~/.local/share/AudiverisLtd/audiveris/
Generic   $XDG_DATA_HOME/AudiverisLtd/audiveris/

Folder                               Description                                       Content
./benches (DEFAULT_BENCHES_FOLDER)   Default location for results of program bench     One file SCORE.bench.properties per score benched
./eval (EVAL_FOLDER)                 User-trained data                                 neural-network.xml, linear-evaluator.xml
./examples (EXAMPLES_FOLDER)         Examples of input images                          allegretto.png, autothreshold_test.JPG, batuque.png, carmen-1.png, chula.1bit.bmp, chula.png, cucaracha.png, Dichterliebe01.pdf, SchbAvMaSample.pdf, zizi.png
./print (DEFAULT_PRINT_FOLDER)       Default location for score printing               One file SCORE.pdf per print
./scores (DEFAULT_SCORES_FOLDER)     Default location for MusicXML export              One file SCORE.xml per export
./scripts (DEFAULT_SCRIPTS_FOLDER)   Default location for user scripts                 One file SCORE.script.xml per script
./temp (TEMP_FOLDER)                 Temporary files                                   audiveris.log (log file), 00n-sS-scanSystem.tif (system image), 00n-sS-gGGGG-retrieveOcrLine.tif (glyph image)
./train (TRAIN_FOLDER)               Material for training
./train/samples                      Isolated samples                                  One folder for samples of same sheet
./train/sheets                       Whole sheet glyphs                                One folder per sheet
./train/symbols (SYMBOLS_FOLDER)     Predefined symbols based on MusicalSymbols font   One file SHAPE.xml per shape
./www (DOC_FOLDER)                   Application documentation                         The whole set of Audiveris documentation, including home page and handbook.
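On Linux, the Ubuntu and Generic rows of the User config and User data tables can be resolved with the same expression, since ~/.config and ~/.local/share are the defaults that the XDG Base Directory convention defines for $XDG_CONFIG_HOME and $XDG_DATA_HOME. The sketch below shows this resolution from a shell; the function names are made up, and the actual lookup performed by the Audiveris Java code may differ.

```shell
# Resolve the Audiveris config folder, falling back to the XDG default.
audiveris_config_dir() {
    printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/AudiverisLtd/audiveris"
}

# Resolve the Audiveris data folder, falling back to the XDG default.
audiveris_data_dir() {
    printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/AudiverisLtd/audiveris"
}

echo "Plugins:  $(audiveris_config_dir)/plugins"
echo "Log file: $(audiveris_data_dir)/temp/audiveris.log"
```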