Storage disks serve volume files from a cloud storage service.
Storage disks allow you to connect your BIIGLE account with cloud storage services. Files from your storage disks can be used to create new volumes in BIIGLE but remain under your control. Storage disks differ from remote locations in that access to the files requires credentials.
To view your storage disks, click on "Storage" in the dropdown menu of the navbar at the top. Next, click on the "Storage Disks" item in the navigation on the left (if it is not already selected). This will open a list of all your storage disks. The list shows the names and types (see below) of your storage disks, as well as the time when they were created.
To create a new storage disk, click on "Add a new storage disk" below the list. This will open a form where you first choose the storage disk type and name. Then, continue to the storage disk options, which differ for each storage disk type. The options of each type are explained below. Finally, submit the form to create the new storage disk.
To edit a storage disk, move the mouse over the item in the list of storage disks, then click on the edit button. This will open a form where you can update the name and options of the storage disk. You can also delete the storage disk here. Volumes that use a storage disk will not be deleted when the storage disk is deleted. However, the volume files can no longer be loaded and most volume features will stop working.
Storage disks will expire 6 months after the last access. This is done to remove their sensitive access credentials from the BIIGLE database if they are no longer used. You will receive a notification if one of your storage disks is about to expire. Storage disks that are about to expire can be manually extended by clicking on the corresponding button of the item in the list of your storage disks.
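As a rough illustration of the expiry rule, the following sketch computes when a storage disk would expire, given the time of its last access. The function names are hypothetical, and how BIIGLE itself handles month boundaries (for example, a last access on the 31st) is an assumption made here.

```python
import calendar
from datetime import datetime


def add_months(moment: datetime, months: int) -> datetime:
    """Shift a datetime by whole calendar months, clamping the day if needed."""
    month_index = moment.month - 1 + months
    year = moment.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day to the last valid day of the target month
    # (assumption: BIIGLE may treat month ends differently).
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


def expiry_date(last_access: datetime) -> datetime:
    """Return the expiry time for a disk: 6 months after the last access."""
    return add_months(last_access, 6)
```

Calling `expiry_date(datetime(2024, 1, 15))` gives 15 July 2024. Any access in the meantime moves the date forward again.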
Different types of storage disks are required to connect with different cloud storage services.
S3 is a protocol that is supported by many cloud storage services such as AWS or Backblaze. With these services, files are stored in "buckets" (or sometimes "containers"). An S3 storage disk can connect with one of these buckets.
An S3 bucket must be configured for cross-origin resource sharing (CORS) before it can be used as a storage disk in BIIGLE. Take a look at the remote locations article for more information on CORS and how it must be configured for BIIGLE. Example CORS rules in the JSON format and detailed setup instructions for AWS can be found below. Here is an example for CORS rules in the XML format:
    <CORSConfiguration>
        <CORSRule>
            <AllowedOrigin>https://biigle.de</AllowedOrigin>
            <AllowedMethod>GET</AllowedMethod>
            <AllowedHeader>*</AllowedHeader>
            <MaxAgeSeconds>30</MaxAgeSeconds>
        </CORSRule>
    </CORSConfiguration>
An S3 storage disk has the following options:
Bucket name: The name of the bucket in which the files are stored.
Region: The compute center region where the bucket is located. Leave this field empty if your cloud storage service does not support regions.
Endpoint: The S3 storage endpoint of your cloud storage service. This must be the full URL including the bucket name, region etc. You should find this somewhere in the documentation of the service.
Access key: The ID of the access credentials. You should configure an S3 bucket policy that allows only read access with these credentials. Sometimes you may even want to restrict the access to a subset of the files in the bucket.
Secret key: The secret key of the access credentials (like a password).
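Before entering the five options in the form, it can help to sanity-check them. The sketch below is only an illustration. `S3DiskOptions` and `check_options` are hypothetical names, not part of BIIGLE, and the checks cover only what the option descriptions above state (a full endpoint URL that includes the bucket name, and non-empty credentials).

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class S3DiskOptions:
    """Hypothetical container for the five S3 storage disk options."""
    bucket: str
    endpoint: str
    access_key: str
    secret_key: str
    region: str = ""  # may stay empty if the service has no regions


def check_options(options: S3DiskOptions) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    url = urlparse(options.endpoint)
    if url.scheme not in ("http", "https") or not url.netloc:
        problems.append("Endpoint must be a full URL, e.g. https://...")
    if options.bucket not in options.endpoint:
        problems.append("Endpoint does not include the bucket name")
    required = (
        ("Bucket name", options.bucket),
        ("Access key", options.access_key),
        ("Secret key", options.secret_key),
    )
    for label, value in required:
        if not value:
            problems.append(f"{label} is empty")
    return problems
```

An empty result does not guarantee that the connection works. It only catches obvious typing mistakes before the form is submitted.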
Instructions to set up a storage disk with AWS:
Log in to the AWS Management Console.
Create a new S3 bucket and upload files to it (see the AWS documentation for instructions). Here we will call the bucket MyBucket and create it in the eu-west-2 region. These are the values for the Bucket name and Region fields that are required to create a new S3 storage disk. The value of the Endpoint field can also be determined now. In this example, it will be https://MyBucket.s3.eu-west-2.amazonaws.com. Replace the bucket name and region in the endpoint URL to get the correct value for your storage disk.
Click on the name of the newly created bucket in the bucket list. Then click on the "Permissions" tab.
Scroll down to "Cross-origin resource sharing (CORS)" and click "Edit".
Enter the following configuration:
    [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["GET"],
            "AllowedOrigins": ["https://biigle.de"],
            "ExposeHeaders": []
        }
    ]
Then click "Save changes".
Now we want to create S3 access credentials. To increase security, we want to restrict the credentials to read access to MyBucket only.
To start, select the IAM service and select "Policies" in the menu on the left.
Click on the "Create policy" button at the top right.
Click on the "JSON" button to enter the policy in the JSON format.
Enter the following JSON policy (replace MyBucket with the name of your S3 bucket):
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket"
                ],
                "Effect": "Allow",
                "Resource": [
                    "arn:aws:s3:::MyBucket",
                    "arn:aws:s3:::MyBucket/*"
                ]
            }
        ]
    }
This policy will allow only read access to the bucket.
Now click the "Next" button.
Choose a name for the policy (example: biigle_MyBucket_s3_read_policy) and click on the "Create policy" button.
If you add more buckets and want to make them available as BIIGLE storage disks, you can create a new policy for each bucket.
Now select "User groups" in the menu on the left and click the "Create group" button.
Choose a name for the group (example: biigle_read_access) and attach the permission policy that was created above. If you add new policies for new buckets later, you can modify the user group and attach the new policies, too.
Now click "Create group".
Now select "Users" in the menu on the left and click "Add users".
Choose a user name (example: biigle_read_user) and click "Next". The user does not need access to the Management Console.
Select the user group created above and click "Next". Then click "Create user".
Click on the user name of the newly created user. Then open the "Security credentials" tab.
Scroll down to "Access keys" and click "Create access key".
Select "Third-party service" in the list, acknowledge the warning below and click "Next".
Choose a description (example: biigle_access_key) and click "Create access key".
Copy the values of "Access key" and "Secret access key". These are the values for the Access key and Secret key fields that are required to create a new S3 storage disk.
Now you can fill all fields that are required to create a new S3 storage disk in BIIGLE.
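The values used in the AWS walkthrough above all follow from the bucket name and region. The following sketch generates the endpoint URL, the CORS rules, and the read-only policy for a different bucket. The function names are hypothetical, and the endpoint pattern is simply the one from the example (https://MyBucket.s3.eu-west-2.amazonaws.com). It should only be treated as a convenience for filling in the templates, not as an official tool.

```python
import json


def aws_endpoint(bucket: str, region: str) -> str:
    """Build the Endpoint value following the pattern of the example above."""
    return f"https://{bucket}.s3.{region}.amazonaws.com"


def cors_rules(origin: str = "https://biigle.de") -> list[dict]:
    """Return the CORS rules that allow GET requests from the BIIGLE origin."""
    return [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["GET"],
            "AllowedOrigins": [origin],
            "ExposeHeaders": [],
        }
    ]


def read_only_policy(bucket: str) -> dict:
    """Return the IAM policy from the walkthrough for the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Effect": "Allow",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Print the values to paste into the AWS console and the BIIGLE form.
    print(aws_endpoint("MyBucket", "eu-west-2"))
    print(json.dumps(cors_rules(), indent=4))
    print(json.dumps(read_only_policy("MyBucket"), indent=4))
```

Generating the JSON this way avoids forgetting to replace one of the two occurrences of the bucket name in the policy.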
The Aruna Object Storage (AOS) is a storage service for the German initiative for a national research data infrastructure (NFDI). Before you can start using AOS, you have to sign up for a user account on the website.
While the connection to AOS can be established via the same S3 protocol that is described above, the setup and configuration work a little differently. Here is a description of the S3 options for AOS:
Bucket name: The name of your AOS project.
Endpoint: The URL https://<bucket>.data.gi.aruna-storage.org, where <bucket> is replaced with the bucket name above.
Access key: The "AccessKey" that is provided with new data proxy credentials.
Secret key: The "SecretKey" that is provided with new data proxy credentials.
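Since the AOS endpoint is derived from the project name, the option values can be assembled mechanically. This is a minimal sketch with a hypothetical function name and hypothetical dictionary keys. Only the URL pattern is taken from the description above.

```python
AOS_HOST = "data.gi.aruna-storage.org"


def aos_disk_options(project: str, access_key: str, secret_key: str) -> dict:
    """Collect the storage disk option values for an AOS project."""
    return {
        "bucket": project,  # the AOS project name doubles as bucket name
        "endpoint": f"https://{project}.{AOS_HOST}",
        "access_key": access_key,  # "AccessKey" of the data proxy credentials
        "secret_key": secret_key,  # "SecretKey" of the data proxy credentials
    }
```

For a project called "myproject", the endpoint becomes https://myproject.data.gi.aruna-storage.org.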
Detailed setup instructions:
Log in to the AOS dashboard, select "Explore" and then "Resources" in the menu at the top.
Click on the "Create new" button and create a new project (we call it "myproject" here). The project name is the value of the Bucket name field that is required to create the new storage disk. With the project name you can also fill the Endpoint field.
Now select "Access" and then "Data proxies" in the AOS menu at the top. Choose a data proxy where you would like to store your data. There, click on the "Create Credential" button. The AccessKey is the value of the Access key field and the SecretKey is the value of the Secret key field that is required to create the new storage disk.
Now you have the values for all fields that are required to create the new storage disk. However, one more step is required before you can annotate your data without restrictions in BIIGLE. You have to configure "Cross-origin resource sharing (CORS)". This is done as follows:
Install s3cmd and run s3cmd --configure. Enter the access key and secret key from above. Don't change the default region. Enter the S3 endpoint data.gi.aruna-storage.org and the bucket template %(bucket)s.data.gi.aruna-storage.org. Leave the remaining options unchanged. Skip the test with the supplied credentials, then save the settings.
Create a file called cors.xml with the following content:
    <CORSConfiguration>
        <CORSRule>
            <AllowedOrigin>https://biigle.de</AllowedOrigin>
            <AllowedMethod>GET</AllowedMethod>
            <AllowedHeader>*</AllowedHeader>
            <MaxAgeSeconds>30</MaxAgeSeconds>
        </CORSRule>
    </CORSConfiguration>
Then run the following command: s3cmd setcors cors.xml s3://myproject (you should replace "myproject" with the actual name of your project). That's it. Now CORS is configured for your project.
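The last two CORS steps above can also be scripted. The sketch below writes the cors.xml file and then calls the same s3cmd setcors command. It assumes that s3cmd is installed and already configured as described. The function names are hypothetical, and `ET.indent` requires Python 3.9 or newer.

```python
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path


def build_cors_xml(origin: str = "https://biigle.de", max_age: int = 30) -> str:
    """Build the CORS configuration shown above as an XML string."""
    root = ET.Element("CORSConfiguration")
    rule = ET.SubElement(root, "CORSRule")
    ET.SubElement(rule, "AllowedOrigin").text = origin
    ET.SubElement(rule, "AllowedMethod").text = "GET"
    ET.SubElement(rule, "AllowedHeader").text = "*"
    ET.SubElement(rule, "MaxAgeSeconds").text = str(max_age)
    ET.indent(root)  # pretty-print; available since Python 3.9
    return ET.tostring(root, encoding="unicode")


def setcors_command(project: str, cors_file: str = "cors.xml") -> list[str]:
    """Return the s3cmd call from the instructions as an argument list."""
    return ["s3cmd", "setcors", cors_file, f"s3://{project}"]


def configure_cors(project: str, cors_file: str = "cors.xml") -> None:
    """Write the CORS file and apply it to the project with s3cmd."""
    Path(cors_file).write_text(build_cors_xml() + "\n", encoding="utf-8")
    # Raises CalledProcessError if s3cmd reports a failure.
    subprocess.run(setcors_command(project, cors_file), check=True)
```

Calling `configure_cors("myproject")` would then perform both steps for the example project.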
Here is a brief example of how you can upload files to your project. This is also done with s3cmd:
Make sure s3cmd is configured as described above.
Now navigate to the parent of the directory that you want to upload. Upload the whole directory with the following command (replace "mydir" with the name of the directory to upload and "myproject" with the name of your project):
s3cmd put -r mydir s3://myproject/
The directory will be created as a new dataset as part of your AOS project. In BIIGLE, you will see it as a directory in the file browser.
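To see in advance which paths the upload above would produce, the local directory can be listed relative to its parent. This sketch assumes that s3cmd keeps the directory name as a prefix when the source is given without a trailing slash, as in the command above. The function name is hypothetical.

```python
from pathlib import Path


def preview_object_keys(directory: str) -> list[str]:
    """List the expected object paths for 's3cmd put -r <directory> s3://...'."""
    root = Path(directory).resolve()
    # Paths are taken relative to the parent so that the directory name
    # itself stays part of each path (assumed s3cmd behavior, see above).
    return sorted(
        path.relative_to(root.parent).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
```

For a directory "mydir" that contains "a.jpg", the preview lists "mydir/a.jpg", which matches the directory that later shows up in the BIIGLE file browser.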