Setup
Setup (Cloud)
CloudReg is designed to be used in the cloud, but components of the CloudReg pipeline can also be run locally. For instructions on fully local use, see here. For cloud setup, see below. We chose to work with Amazon Web Services (AWS), and the setup instructions below are specific to AWS.
Requirements
AWS account
IAM Role and User with credentials to access EC2 and S3
S3 Bucket to store raw data
S3 Bucket to store processed data (can be the same bucket as above)
EC2 instance with Docker
EC2 instance with MATLAB
Local computer (used to send commands to cloud services)
(Optional) CloudFront CDN with HTTP/2 enabled for fast visualization
(Optional) Web Application Firewall for IP-address restriction on data access.
Create AWS account
Follow the instructions to create an AWS account, or use an existing one. All of the following AWS setup instructions should be performed within the same AWS account.
Create IAM Role
Log into AWS console
Navigate to IAM section of console
Click on Roles in the left sidebar
Click Create Role
Click AWS Service under Type of Trusted Entity
Click EC2 as the AWS Service and click Next
Next to Filter Policies, search for S3FullAccess and EC2FullAccess and click the checkbox next to both to add them as policies to this role.
Click Next
Click Next on the Add Tags screen. Adding tags is optional.
On the Review Role screen, choose a role name, like cloudreg_role, and customize the description as you see fit.
Finally, click Create Role
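If you prefer the command line, the same role can be created with the AWS CLI. The sketch below is illustrative only: it assumes the AWS CLI is installed and configured, that cloudreg_role and trust-policy.json (a standard EC2 trust policy you write yourself) are placeholder names, and that the console search above matches the managed policies AmazonS3FullAccess and AmazonEC2FullAccess.
# create the role, trusting EC2 to assume it (trust-policy.json is a placeholder file)
aws iam create-role --role-name cloudreg_role --assume-role-policy-document file://trust-policy.json
# attach the two managed policies selected in the console steps above
aws iam attach-role-policy --role-name cloudreg_role --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam attach-role-policy --role-name cloudreg_role --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess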
Create IAM User
Log into AWS console
Navigate to IAM section of console
Click on Users in the left sidebar
Click Add User
Choose a User name like cloudreg_user, check Programmatic Access, and click Next
Click on Attach existing policies directly and search for and add S3FullAccess and EC2FullAccess, and click Next
Click Next on the Add Tags screen. Adding tags is optional. Then click Next
On the Review screen, verify the information is correct and click Create User
On the next screen, download the autogenerated credentials (access key ID and secret access key) and keep them private and secure. We will need these credentials later when running the pipeline.
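Equivalently, the user and its credentials can be created from the AWS CLI; a minimal sketch, assuming the same placeholder user name and the managed policy names used above:
aws iam create-user --user-name cloudreg_user
aws iam attach-user-policy --user-name cloudreg_user --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam attach-user-policy --user-name cloudreg_user --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
# prints the access key ID and secret access key exactly once; store them securely
aws iam create-access-key --user-name cloudreg_user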
Create S3 Bucket
Log into AWS console
Navigate to S3 section of console
Click Create Bucket
Choose a bucket name and be sure to choose the bucket region carefully. You will want to pick the region that is geographically closest to you for optimal visualization speeds. Record the region you have chosen.
Uncheck Block All Public Access. We will restrict access to the data using CloudFront and a Firewall.
The remaining settings can be left as is. Click Create Bucket
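The bucket can also be created from the AWS CLI; a minimal sketch, assuming a placeholder bucket name and the us-west-2 region (substitute your own values):
# bucket names are globally unique; for us-east-1, omit --create-bucket-configuration
aws s3api create-bucket --bucket my-cloudreg-bucket --region us-west-2 --create-bucket-configuration LocationConstraint=us-west-2
# the CLI equivalent of unchecking Block All Public Access in the console
aws s3api put-public-access-block --bucket my-cloudreg-bucket --public-access-block-configuration BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=false,RestrictPublicBuckets=false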
Set up CORS on S3 Bucket containing processed data/results
Log into AWS console
Navigate to S3 section of console
Click on the S3 Bucket you would like to add CORS to.
Click on the Permissions tab
Scroll to the bottom and click Edit under Cross-origin resource sharing (CORS)
Paste the following text:
[ { "AllowedHeaders": [ "Authorization" ], "AllowedMethods": [ "GET" ], "AllowedOrigins": [ "*" ], "ExposeHeaders": [], "MaxAgeSeconds": 3000 } ]
Click Save Changes
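The same rules can be applied from the AWS CLI. Note that put-bucket-cors expects the rules wrapped in a top-level CORSRules key, so a hedged sketch (the bucket name is a placeholder) looks like:
# cors.json contains: {"CORSRules": [ ...the rule object shown above... ]}
aws s3api put-bucket-cors --bucket my-cloudreg-bucket --cors-configuration file://cors.json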
Set up Docker EC2 instance
Log into AWS console
Navigate to EC2 section of console
In the left sidebar, click Instances. Make sure you change the region (top right, middle drop-down menu) to match that of your raw data and processed data S3 buckets.
Click Launch Instances
In the search bar, enter the following: ami-098555c9b343eb09c. This is an Amazon Machine Image (AMI) called Deep Learning AMI (Ubuntu 18.04) Version 38.0. Click Select when this AMI shows up.
The default instance type should be t2.micro; if not, change it to that type. Leave the remaining choices as their defaults and click Review and Launch.
Verify the EC2 instance information is correct and click Launch.
When the key pair pop-up appears, select Choose an existing key pair if you have already created one, or select Create a new key pair if you do not already have one. Follow the instructions on-screen to download and save the key pair.
Follow the AWS tutorial to connect to this EC2 instance through the command line.
Once you have connected to the instance via SSH, create the cloud-volume credentials file on the instance using the CLI text editor of your choice (an example of this file is sketched after these steps).
Install docker-compose by running
sudo curl -L "https://github.com/docker/compose/releases/download/1.28.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose; sudo chmod +x /usr/local/bin/docker-compose
Run
sudo shutdown now
to turn off the EC2 instance. Record the “Instance ID” of this CloudReg instance (this can be found in the EC2 console). We will need this when running the pipeline.
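For reference, cloud-volume conventionally reads AWS credentials from ~/.cloudvolume/secrets/aws-secret.json on the machine running it. A sketch of that file using the IAM user credentials created earlier; the path and field names follow cloud-volume's documented convention, so double-check them against the cloud-volume version you install:
{
    "AWS_ACCESS_KEY_ID": "<access key ID of cloudreg_user>",
    "AWS_SECRET_ACCESS_KEY": "<secret access key of cloudreg_user>"
}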
Set up MATLAB EC2 instance
Follow instructions here on setting up MATLAB on an EC2 instance. Be sure to create this instance in the same region as your S3 buckets. Be sure to use the same SSH key you created for the CloudReg EC2 instance.
After creating this instance, navigate to the EC2 console and record the “Instance ID” of this MATLAB instance. We will need this when running the pipeline.
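If the console is inconvenient, instance IDs can also be listed with the AWS CLI; a sketch (the region is a placeholder and should match your buckets):
# prints instance IDs and their current state in the chosen region
aws ec2 describe-instances --region us-west-2 --query "Reservations[].Instances[].[InstanceId,State.Name]" --output table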
Set up AWS CloudFront
Log into AWS console
Navigate to CloudFront section of console
Click “Create Distribution” and then click “Get Started”.
Click in the “Origin Domain Name” box and select the S3 bucket you previously created to store preprocessed data for visualization. Once you select your S3 bucket from the drop-down menu, the Origin ID should populate automatically.
Leave all other default parameters under “Origin Settings”.
See the video below on how to set up the remaining parameters.
After following the video, click “Create Distribution”.
NOTE: Be sure to save the CloudFront URL that is created for that distribution. It can be found at the CloudFront console homepage after clicking on the distribution you created. It should appear next to “Domain Name”.
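If you lose track of the domain name, it can also be retrieved with the AWS CLI; a sketch:
# lists each distribution's ID and its *.cloudfront.net domain name
aws cloudfront list-distributions --query "DistributionList.Items[].[Id,DomainName]" --output table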
Set up AWS Web Application Firewall
Before setting up the Web Application Firewall, find the IP address(es) you would like to grant access to. Often this information can be obtained by emailing IT at your institution, or by visiting whatismyip to find just your own IP address.
Log into AWS console
Navigate to WAF section of console. This link will redirect you to WAF classic in order to implement our firewall.
In the drop-down menu next to “Filter”, select “Global (CloudFront)”.
Click “Create Web ACL”.
Choose a name that is unique for your web ACL and leave the CloudWatch metric name and Region Name as is.
Click on the drop-down next to “AWS resource to associate” and choose the CloudFront distribution you created previously.
Click “Next”
To the right of “IP Match Conditions”, click “Create Condition”.
Choose a unique name and leave the region as “Global”.
Next to “IP address range”, input the IP range that you identified at the start of this section. You can verify this range with a CIDR calculator (for example, a single address such as 203.0.113.25 is written as 203.0.113.25/32).
Click “Create” at the bottom right and then click “Next”.
Click “Create Rule” to the right of “Add rules to web ACL”.
Choose a name and leave the other 2 parameters as default.
Under “Add conditions”, choose “does” and “originate from an IP address in”
Under the third drop-down, choose the IP match condition you created above.
Under “If a request matches all of the conditions in a rule, take the corresponding action”, choose allow.
Under “If a request doesn’t match any rules, take the default action” choose “block all requests that don’t match rules”
Click “Review and Create” and then, on the next page, choose “Confirm and create”.
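The allow-list itself can in principle be built with the classic WAF CLI as well, though the console flow above is the supported path. A heavily hedged sketch of just the IP set portion; every mutating call needs a fresh change token, and the set name and example CIDR are placeholders:
# request a change token, then create and populate the IP set
aws waf get-change-token
aws waf create-ip-set --name cloudreg-allowed-ips --change-token <token>
aws waf update-ip-set --ip-set-id <ip-set-id> --change-token <new-token> --updates 'Action=INSERT,IPSetDescriptor={Type=IPV4,Value="203.0.113.25/32"}'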
Local machine setup
On a local machine of your choice, follow the instructions below from within a terminal window (command line). These steps only need to be done the FIRST TIME you set up the pipeline.
Install Docker
Make sure Docker is open and running.
Open a new Terminal window.
Pull the CloudReg docker image:
docker pull neurodata/cloudreg:local
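To confirm the pull succeeded, you can list the image and, optionally, open a shell inside it; these are generic Docker commands, not CloudReg-specific ones:
# show the locally available tags of the image
docker images neurodata/cloudreg
# optionally start an interactive shell in the container (assumes the image includes bash)
docker run --rm -it neurodata/cloudreg:local /bin/bash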
Setup (Local)
CloudReg is designed to be used in the cloud, but components of the CloudReg pipeline can also be run locally. Instructions for local setup are below.
Requirements
Local Machine
MATLAB license
Local Machine Setup
On a local machine of your choice, follow the instructions below from within a terminal window (command line). These steps only need to be done the FIRST TIME you set up the pipeline.