Novelty detection stage

Using novelty detection to obtain training data.

This method of obtaining training data processes all images of a volume with an unsupervised novelty detection method. The novelty detection attempts to find "interesting" objects or regions in the images, which are called "training proposals". It acts without any prior knowledge of what you, or anyone else who wants to explore the images, consider interesting. Hence, the quality and meaningfulness of the training proposals may vary dramatically, depending on the images themselves and on what you are looking for.

To make the novelty detection as flexible as possible, there are many parameters that can be configured before a new MAIA job is submitted. You might have to try out a few parameter combinations before the novelty detection produces meaningful training proposals. In cases where the novelty detection produces too few meaningful training proposals or does not work at all, you can try one of the other methods to obtain training data: existing annotations or knowledge transfer.

Once the novelty detection is finished for a MAIA job, the job continues to the training proposals stage.

Configurable parameters

By default, only one novelty detection parameter is shown: the one most likely to be modified for each new job. To show all configurable parameters, click on the button below the form.

Number of image clusters

Integer between 1 and 100. Default 5

This parameter specifies the number of different kinds of images that you expect. Images are of the same kind if they have similar lighting conditions or show similar patterns (e.g. sea floor, habitat types). Increase this number if you expect many different kinds of images. Lower the number to 1 if you have very few images and/or the content is largely uniform.

If your novelty detection results in many large training proposals for very dark, bright or otherwise unusual images, consider increasing the number of image clusters and running the novelty detection again in a new MAIA job.

The number of image clusters is denoted as K in the MAIA paper [1].

Patch size

Integer between 3 and 99. Default 39

This parameter specifies the size in pixels of the image patches used to determine the training proposals. Increase the size if the images contain larger objects of interest and decrease it if the objects are smaller. Larger patch sizes take longer to compute. The patch size must be an odd number.

The size of objects or regions of interest is relative to the image and directly depends on the distance between the objects and the camera (e.g. the sea floor and the camera). If the camera is close to the objects, choose a larger patch size; if the camera is farther away, choose a smaller patch size.

The patch size is denoted as r_e in the MAIA paper [1].
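
The constraints above (an odd integer between 3 and 99) can be checked before a job is submitted. The following is a minimal sketch; the function names `validate_patch_size` and `nearest_valid_patch_size` are hypothetical helpers for illustration and are not part of MAIA.

```python
def validate_patch_size(value):
    """Return the patch size if it is an odd integer between 3 and 99."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("patch size must be an integer")
    if not 3 <= value <= 99:
        raise ValueError("patch size must be between 3 and 99")
    if value % 2 == 0:
        raise ValueError("patch size must be an odd number")
    return value


def nearest_valid_patch_size(estimate):
    """Turn a rough object size estimate into an allowed patch size."""
    clamped = max(3, min(99, int(round(estimate))))
    # Even values are bumped up by one; the upper limit 99 is already odd.
    return clamped if clamped % 2 == 1 else clamped + 1
```

For example, if the objects of interest are roughly 40 pixels wide, `nearest_valid_patch_size(40)` suggests a patch size of 41.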

Threshold percentile

Integer between 0 and 99. Default 99

This is the percentile of pixel saliency values used to determine the saliency threshold. Lower this value to get more training proposals. The default value should be fine for most cases.

The threshold percentile is denoted as P_99 in the MAIA paper [1]. The 99 is the parameter that you can configure here.
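
To illustrate why a lower percentile yields more training proposals, here is a minimal sketch that derives a threshold from a list of saliency values. It assumes the nearest-rank definition of a percentile and that pixels above the threshold are the ones considered further; the actual computation in MAIA may differ, and both function names are hypothetical.

```python
def saliency_threshold(saliency_values, percentile=99):
    """Return the saliency value at the given percentile (nearest rank)."""
    if not 0 <= percentile <= 99:
        raise ValueError("percentile must be between 0 and 99")
    ordered = sorted(saliency_values)
    if not ordered:
        raise ValueError("no saliency values given")
    # Integer ceiling of percentile * n / 100, with a minimum rank of 1.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def count_above(saliency_values, percentile):
    """Count the values that exceed the percentile threshold."""
    threshold = saliency_threshold(saliency_values, percentile)
    return sum(1 for value in saliency_values if value > threshold)
```

With 100 distinct saliency values, a percentile of 99 leaves 1 value above the threshold, while a percentile of 90 leaves 10.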

Latent layer size

Number between 0.05 and 0.75. Default 0.1

This parameter specifies the learning capability used to determine training proposals. Increase this number to ignore more complex "uninteresting" objects and patterns.

The latent layer size is denoted as the compression factor in s = [0.1r] in the MAIA paper [1]. The 0.1 is the parameter that you can configure here.

Number of training patches

Integer between 1000 and 100000. Default 10000

This parameter specifies the number of training image patches used to determine training proposals. You can increase this number for a large volume, but the novelty detection will then take longer to compute.

The number of training patches is the number of samples used to train an autoencoder network for one cluster of images U_k as described in the MAIA paper [1].

Number of training epochs

Integer between 50 and 1000. Default 100

This parameter specifies the time spent on training of an autoencoder network when determining the training proposals. The more time is spent on training, the more complex "uninteresting" objects or patterns can be ignored.

The number of training epochs is the number of epochs for which each autoencoder network is trained on one cluster of images U_k as described in the MAIA paper [1].
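
As a rough way to compare the training effort of different parameter combinations, the following sketch multiplies the number of image clusters, training patches and training epochs. It assumes one autoencoder network per image cluster and that every epoch processes each training patch once; the function name is hypothetical and the result is only a relative measure, not a runtime prediction.

```python
def patch_presentations(clusters=5, training_patches=10000, epochs=100):
    """Estimate how many patches are shown to the networks in total."""
    # One network per image cluster, each seeing every patch once per epoch.
    return clusters * training_patches * epochs
```

With the default values this gives 5,000,000 patch presentations. Doubling the number of training patches or the number of epochs doubles the estimate.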


Stride

Integer between 1 and 10. Default 2

A higher stride increases the speed of the novelty detection but reduces the sensitivity to small regions or objects. Set the stride to 1 to disable the speed optimization and process the images in their original resolution. In the MAIA paper [1], we found that a stride of 2 does not reduce the performance of the novelty detection. You might be able to use an even higher stride than that.

The stride is used for the "convolution operation" in which each DBM_k is applied to the images of a cluster U_k as described in the MAIA paper [1].
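
The speed-up from a higher stride can be estimated by counting the image positions that are evaluated. This is a minimal sketch, assuming that a position is evaluated at every stride-th pixel in both directions and ignoring any border handling; the function name is hypothetical.

```python
def evaluated_positions(width, height, stride=2):
    """Count the pixel positions visited when stepping by `stride`."""
    if not 1 <= stride <= 10:
        raise ValueError("stride must be between 1 and 10")
    # range(0, n, stride) visits ceil(n / stride) positions.
    columns = len(range(0, width, stride))
    rows = len(range(0, height, stride))
    return columns * rows
```

For an image of 1000 by 800 pixels, a stride of 1 evaluates 800,000 positions, while the default stride of 2 evaluates 200,000, which is a quarter of the work.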

Ignore radius

Integer greater than or equal to 0. Default 5

Ignore training proposals or annotation candidates which have a radius smaller than or equal to this parameter in pixels. You can use this to filter out training proposals that are smaller than the objects or regions of interest that you expect. Fewer training proposals mean a lower workload for you in the training proposals stage of MAIA. The default value of 5 pixels is sensible because it is unlikely that smaller objects can be accurately identified.

In the MAIA paper [1], no training proposals were ignored, which is equivalent to an ignore radius of 0.
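
The effect of the ignore radius can be sketched as a simple filter. The representation of a training proposal as a dictionary with `x`, `y` and `radius` keys is an assumption made for this example and does not reflect how MAIA stores proposals internally.

```python
def filter_proposals(proposals, ignore_radius=5):
    """Keep only proposals whose radius is larger than `ignore_radius`."""
    if ignore_radius < 0:
        raise ValueError("ignore radius must be greater than or equal to 0")
    # A radius smaller than or equal to the parameter is ignored.
    return [p for p in proposals if p["radius"] > ignore_radius]


# Hypothetical proposals with radii of 3, 5 and 12 pixels.
proposals = [
    {"x": 120, "y": 80, "radius": 3},
    {"x": 400, "y": 310, "radius": 5},
    {"x": 52, "y": 610, "radius": 12},
]
kept = filter_proposals(proposals)
```

With the default ignore radius of 5, only the proposal with a radius of 12 pixels is kept. With an ignore radius of 0, all three proposals are kept.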

Further reading


  1. Zurowietz, M., Langenkämper, D., Hosking, B., Ruhl, H. A., & Nattkemper, T. W. (2018). MAIA—A machine learning assisted image annotation method for environmental monitoring and exploration. PloS one, 13(11), e0207498. doi: 10.1371/journal.pone.0207498