aws-cloudformation-ecs-senzing-stack-choices

Synopsis

The aws-cloudformation-ecs-senzing-stack-choices AWS Cloudformation template deploys user-specified components of the Senzing stack for use with a previously deployed aws-cloudformation-database-cluster Cloudformation stack that has been initialized by deploying the aws-cloudformation-ecs-senzing-stack-basic Cloudformation stack.

Overview

The aws-cloudformation-ecs-senzing-stack-choices demonstration is an AWS Cloudformation template that creates the following resources:

  1. AWS infrastructure
    1. Elastic IP address
    2. NAT Gateway
    3. Subnets
    4. Routes
    5. IAM Roles and Policies
    6. Logging
  2. AWS services deployed, if required
    1. AWS Cognito
    2. AWS Elastic Container Service (ECS) Fargate
    3. AWS Elastic File System (EFS)
    4. AWS Simple Queue Service (SQS)
  3. Optional Senzing services
    1. Senzing API server
    2. Senzing Entity Search Web App
    3. Senzing Redoer
    4. Senzing SSH access
    5. Senzing Stream-Loader
    6. Senzing Stream-producer
    7. Senzing Xterm
    8. SwaggerUI

The following diagram shows the relationship of the docker containers in this docker composition. Arrows represent data flow.

Image of architecture

This docker formation brings up the following docker containers:

  1. senzing/entity-web-search-app
  2. senzing/jupyter
  3. senzing/redoer
  4. senzing/senzing-api-server
  5. senzing/sshd
  6. senzing/stream-loader
  7. senzing/stream-producer
  8. senzing/xterm

GitHub repository for aws-cloudformation-ecs-senzing-stack-choices.

Contents

  1. Preamble
    1. Legend
  2. Expectations
  3. Demonstrate using AWS Console
  4. Using deployment
  5. Additional topics
  6. Parameters
  7. Outputs

Preamble

At Senzing, we strive to create GitHub documentation in a "don't make me think" style. For the most part, instructions are copy and paste. Whenever thinking is needed, it's marked with a "thinking" icon 🤔. Whenever customization is needed, it's marked with a "pencil" icon ✏️. If the instructions are not clear, please let us know by opening a new Documentation issue describing where we can improve. Now on with the show...

Legend

  1. 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps there are some choices to be made. Perhaps it's an optional step.
  2. ✏️ - A "pencil" icon means that the instructions may need modification before performing.
  3. ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.

Expectations

  • Time: Budget 40 minutes to get the demonstration up-and-running.
  • Background knowledge: This repository assumes a working knowledge of:

Demonstrate using AWS Console

Launch AWS Cloudformation

  1. ⚠️ Warning: This Cloudformation deployment will accrue AWS costs. With appropriate permissions, the AWS Cost Explorer can help evaluate costs.
  2. Visit AWS Cloudformation with Senzing template
  3. At lower-right, click on "Next" button.
  4. In Specify stack details
    1. In Parameters
      1. In Security responsibility
        1. Understand the nature of the security in the deployment.
        2. Once understood, enter "I AGREE".
      2. In Senzing installation
        1. Accept the End User License Agreement
      3. In Security
        1. Enter your email address. Example: [email protected]
      4. In Identify existing resources
        1. Enter the stack name of the previously deployed aws-cloudformation-database-cluster Cloudformation stack. Example: senzing-db
      5. In Optional: Initial data load
        1. If loading data during deployment is desired, choose "Yes" for "Optional: Would you like to have an initial set of data imported?"
        2. If "Yes" is chosen, the other fields specify what data is to be loaded.
      6. In Optional: Service
        1. Individual services can be selected.
    2. At lower-right, click "Next" button.
  5. In Configure stack options
    1. At lower-right, click "Next" button.
  6. In Review senzing-basic
    1. Near the bottom, in Capabilities
      1. Check "☑ I acknowledge that AWS CloudFormation might create IAM resources."
    2. At lower-right, click "Create stack" button.
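
For scripted deployments, the console choices above correspond to template parameters. The following is a minimal sketch, assuming the parameter names documented in the Parameters section, that builds the list of ParameterKey/ParameterValue pairs in the shape the AWS CLI and SDKs accept for CloudFormation parameters. The helper name `build_parameters`, the email address, and the EULA placeholder are hypothetical and are not taken from the template.

```python
import json


def build_parameters(choices):
    """Convert a dict of parameter names to values into the
    ParameterKey/ParameterValue list used for CloudFormation parameters."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(choices.items())
    ]


# Hypothetical values; replace each before use.
choices = {
    "AcceptEula": "<see SENZING_ACCEPT_EULA>",  # placeholder, not a real value
    "CidrInbound": "0.0.0.0/0",
    "CognitoAdminEmail": "admin@example.com",   # hypothetical address
    "DatabaseStack": "senzing-db",
    "SecurityResponsibility": "I AGREE",
    "RunApiServer": "Yes",
    "RunJupyter": "No",
}

parameters = build_parameters(choices)
print(json.dumps(parameters, indent=2))
```

Parameters omitted from the list fall back to the defaults described in the Parameters section.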

Using deployment

  1. Visit AWS CloudFormation console.
    1. Make sure correct AWS region is selected.
  2. Wait until "senzing-basic" status is CREATE_COMPLETE.
    1. Senzing formation takes about 20 minutes to fully deploy.
    2. You may have to click the refresh button a few times to get updated information.
  3. Click on "senzing-basic" stack.
  4. Click on "Outputs" tab.
  5. Open the "0penFirst" value in a new web browser tab or window.
    1. Because this uses a self-signed certificate, a warning will come up in your browser. Simply continue.
    2. In the "Sign in with your email and password" dialog box, enter the UserName and UserInitPassword values seen in the "Output" tab of the "senzing-basic" stack. This is a one-time password.
    3. In Change Password, enter a new password.

Additional topics

  1. How to load AWS Cloudformation queue
  2. How to migrate Senzing in AWS Cloudformation
  3. How to update Senzing license

Review AWS Cloudformation

The AWS resources created by the cloudformation.yaml template can be seen in the AWS Management Console.

  1. CloudFormation
    1. Stacks
  2. CloudWatch
    1. Log groups
  3. Cognito
    1. UserPool
  4. Elastic Compute Cloud (EC2)
    1. Load Balancers
    2. Network interfaces
    3. Target groups
  5. Elastic Container Service (ECS)
    1. Clusters
    2. Task Definitions
  6. Elastic File System (EFS)
    1. File systems
  7. Identity and Access Management (IAM)
    1. Certificates
    2. Policies
    3. Roles
  8. Lambda
    1. Functions
  9. Relational Database Service (RDS)
    1. Databases
    2. Parameter groups
    3. Subnet groups
  10. Route53
    1. RecordSet
  11. Simple Queue Service (SQS)
    1. Queues
  12. Systems Manager (SSM)
    1. Parameter store
  13. Virtual Private Cloud (VPC)
    1. Elastic IP addresses
    2. Endpoints
    3. Internet gateways
    4. NAT gateways
    5. Network ACLs
    6. Route Tables
    7. Security Groups
    8. Subnets
    9. VPCs

View results

  1. Visit AWS Cloudformation console.
  2. Choose appropriate "Stack name"
  3. Choose "Outputs" tab.
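
The same information is available outside the console, for example from `aws cloudformation describe-stacks`. The sketch below assumes a DescribeStacks response already loaded as a Python dict and pulls out the stack status and the Outputs values. The function name `stack_outputs` and the sample values are hypothetical; only the fields used here are shown.

```python
def stack_outputs(describe_stacks_response, stack_name):
    """Return (status, {OutputKey: OutputValue}) for the named stack."""
    for stack in describe_stacks_response.get("Stacks", []):
        if stack.get("StackName") == stack_name:
            outputs = {
                item["OutputKey"]: item["OutputValue"]
                for item in stack.get("Outputs", [])
            }
            return stack.get("StackStatus"), outputs
    raise KeyError(f"stack not found: {stack_name}")


# Hypothetical response, trimmed to the fields used here.
sample = {
    "Stacks": [
        {
            "StackName": "senzing-basic",
            "StackStatus": "CREATE_COMPLETE",
            "Outputs": [
                {"OutputKey": "UserName", "OutputValue": "admin@example.com"},
                {"OutputKey": "0penFirst", "OutputValue": "https://example.com/app"},
            ],
        }
    ]
}

status, outputs = stack_outputs(sample, "senzing-basic")
if status == "CREATE_COMPLETE":
    # Sorting puts "0penFirst" ahead of names that start with a letter.
    for key in sorted(outputs):
        print(f"{key}: {outputs[key]}")
```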

Parameters

Technical information on AWS Cloudformation parameters can be seen at Parameters.

AcceptEula

  1. Synopsis: To use the Senzing code, you must agree to the End User License Agreement (EULA). This step is intentionally tricky to ensure that you make a conscious effort to accept the EULA.
  2. Required: Yes
  3. Type: String
  4. Allowed values: See SENZING_ACCEPT_EULA.
  5. Default: None

CidrInbound

  1. Synopsis: A Classless Inter-Domain Routing (CIDR) value used to limit access to the system. This restricts the inbound traffic to requests from specified IP ranges. Examples:
    1. A system with the value 0.0.0.0/0 allows access from anywhere.
    2. A system with the value 45.26.129.0/24 will allow access from IP addresses in the range 45.26.129.0 to 45.26.129.255
    3. A system with the value 45.26.129.200/32 will allow access from a single IP address 45.26.129.200.
  2. Required: Yes
  3. Type: String
  4. Allowed pattern: An IPv4 address with an optional prefix length. Specifically: '(?:\d{1,3}\.){3}\d{1,3}(?:/\d\d?)?'
  5. Allowed values: String in IPv4 CIDR format.
  6. Example: 45.26.129.200/32
  7. Default: 0.0.0.0/0
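
To check what a CidrInbound value actually permits before deploying, the ranges in the examples above can be reproduced with Python's standard `ipaddress` module. This is a minimal sketch: the function name `describe_cidr` is hypothetical, and it assumes the allowed pattern must match the whole value.

```python
import ipaddress
import re

# Pattern quoted from the "Allowed pattern" entry above.
CIDR_PATTERN = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}(?:/\d\d?)?")


def describe_cidr(value):
    """Return (first address, last address, address count) for an IPv4 CIDR."""
    if not CIDR_PATTERN.fullmatch(value):
        raise ValueError(f"not in IPv4 CIDR form: {value}")
    # strict=True rejects values whose host bits are set, such as 45.26.129.7/24.
    network = ipaddress.ip_network(value, strict=True)
    return str(network[0]), str(network[-1]), network.num_addresses


for cidr in ("0.0.0.0/0", "45.26.129.0/24", "45.26.129.200/32"):
    print(cidr, describe_cidr(cidr))
```

The regular expression only checks the shape of the string, so a value such as 999.1.1.1/99 passes the pattern; the `ipaddress` call is what rejects out-of-range octets and prefix lengths.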

CognitoAdminEmail

  1. Synopsis: An email address of the person administering this Cloudformation deployment. The email address will be used when email is sent to additional users via the AWS Cognito web console.
  2. Required: Yes
  3. Type: String
  4. Allowed values:
    1. A string in email format.
    2. Example: [email protected]

DatabaseStack

  1. Synopsis: The stack name of the "Senzing aws-cloudformation-database-cluster" deployment. See aws-cloudformation-database-cluster.
  2. Required: Yes
  3. Type: String

RunApiServer

  1. Synopsis: Optionally, run the Senzing API server to create a RESTful API service to the Senzing Engine.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: Yes

RunJupyter

  1. Synopsis: Optionally, run the Senzing Jupyter notebooks to view Jupyter notebooks showing Senzing code samples.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: No

RunRedoer

  1. Synopsis: Optionally, run the redoer to process "redo records".
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: Yes

RunSshd

  1. Synopsis: Optionally, run the sshd container that allows ssh and scp access. Can be used for debugging, copying files to the EFS, or the Senzing Exploratory Tools. This is an economical container. To run a "maxed-out" container, see RunSshdMax.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: Yes

RunSshdMax

  1. Synopsis: Optionally, run the sshd container that allows ssh and scp access. Can be used for debugging, copying files to the EFS, or the Senzing Exploratory Tools. This differs from RunSshd in that it has the maximum resources of 30 GB memory and 4 vCPUs.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: Yes

RunStreamLoader

  1. Synopsis: Optionally, run the stream-loader, which reads records from the SQS queue and sends them to the Senzing Engine.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: Yes

RunStreamProducer

  1. Synopsis: Optionally, run the stream-producer container that fetches JSON lines from a file and pushes them to the SQS queue. If "Yes" is chosen, SenzingInputUrl, SenzingRecordMin, and SenzingRecordMax need to be specified.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: Yes

RunSwagger

  1. Synopsis: Optionally, run the swaggerapi/swagger-ui container that hosts the SwaggerUI for viewing the Senzing REST API OpenAPI document.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: Yes

RunVpcFlowLogs

  1. Synopsis: Optionally, capture information about the IP traffic going to and from network interfaces in your VPC.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Default: No
  6. References:
    1. VPC Flow Logs.

RunWebApp

  1. Synopsis: Optionally, run the entity-search-web-app which gives a web-based representation of data stored in the Senzing data model.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Example:
  6. Default: Yes

RunXterm

  1. Synopsis: Optionally, run the Senzing Xterm, which gives a web-based terminal useful for running command-line programs.
  2. Required: Yes
  3. Type: Boolean
  4. Allowed values: [ "Yes" | "No" ]
  5. Example:
  6. Default: Yes

SecurityResponsibility

  1. Synopsis: The Senzing proof-of-concept AWS Cloudformation uses AWS Cognito for authentication, and HTTPS (using a self-signed certificate) for encrypted network traffic to expose services through a single, internet-facing AWS Elastic Load Balancer. With exception of the senzing/sshd container, no tasks in the AWS Elastic Container Service (ECS) have public IP addresses.

    To enable additional security measures for the deployment in your specific environment, you'll need to consult with your AWS administrator. Examples of additional security measures:

  2. Required: Yes

  3. Type: String

  4. Allowed values:

    1. "I AGREE"
  5. Default: None

SenzingDataSource

  1. Synopsis: If using RunStreamProducer, supply the DATA_SOURCE value to be used.
  2. Required: Yes
  3. Type: String
  4. Default: TEST

SenzingEntityType

  1. Synopsis: If using RunStreamProducer, supply the ENTITY_TYPE value to be used.
  2. Required: Yes
  3. Type: String
  4. Default: GENERIC
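
As an illustration of what supplying SenzingDataSource and SenzingEntityType means for a record, the sketch below fills in DATA_SOURCE and ENTITY_TYPE on one JSON-lines record. It is an assumption that the values act as defaults for records that lack them; this document does not say whether existing values are overridden. The function name `apply_defaults` and the sample record fields are illustrative only.

```python
import json


def apply_defaults(json_line, data_source="TEST", entity_type="GENERIC"):
    """Fill in DATA_SOURCE and ENTITY_TYPE on one JSON-lines record
    when the record does not already carry them (assumed behavior)."""
    record = json.loads(json_line)
    record.setdefault("DATA_SOURCE", data_source)
    record.setdefault("ENTITY_TYPE", entity_type)
    return json.dumps(record, sort_keys=True)


# Illustrative record; field names other than DATA_SOURCE and
# ENTITY_TYPE are not taken from this document.
print(apply_defaults('{"RECORD_ID": "1001", "NAME_FULL": "Robert Smith"}'))
```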

SenzingInputUrl

  1. Synopsis: If using RunStreamProducer, supply the URL of a tar-gzipped file in JSON-lines format containing records to ingest into Senzing.
  2. Required: Yes if running Stream Producer, otherwise no.
  3. Type: String
  4. Allowed pattern: A URL starting with http:// or https://.
  5. Example: https://www.example.com/my/records.json
  6. Default: https://s3.amazonaws.com/public-read-access/TestDataSets/SenzingTruthSet/truth-set.json

SenzingLicenseAsBase64

  1. Synopsis: To ingest more than 100,000 records, a Senzing license is required. A binary version of the Senzing license, g2.lic, is not usable as a parameter in the text entry field. Instead, a Base64 representation of the information is needed. An example of how to produce base64 from g2.lic on Linux and macOS:

    base64 /opt/senzing/etc/g2.lic

    Copy the entire output from the command and paste into the text entry field.

  2. Required: Yes if ingesting more than 100,000 records, otherwise no.

  3. Type: String

  4. Allowed pattern: Empty or Base64 characters. Specifically ^$|[^-A-Za-z0-9+/=]|=[^=]|={3,}$

  5. Allowed values: Base64 encoded string

  6. Example:

    AQAAADgCAAAAAAAAU2VuemluZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARGVtbyBFeHBpcmVkAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADIwMjAtMTItMTYA
    AAAAAAAAAAAARVZBTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAFNUQU5EQVJEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKCGAQAAAAAAMTk3Ni0wMS0wMQAAAAAAAAAAAABN
    T05USExZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAARkdIST5XYOZ90kbyAbU7wM7XvPCwq/FgORZIekwFMg8zi3tCD0V5+12q72aqk0E6JOct
    +cPAq/T50N5Pf5nvJZ6TaW3TzQbnH/z5f/ALsWLydE2DPNvq3HuAjkjZpg2h7mb4OUqorGxDI9RX
    TX8hPjzYrBfMdOgl1DlRBVG36WwdpB8AnSfaegbYU+U/vfof+ff6mJk8gzPg+OGPwg21/S6i2TT4
    RbTCSYP/TpfXyJGE6dbQWEC9rFhYuWq3mFF3z7zFEcmxpNfZuBtYsxni8P3sDZ706RA+wcQF7TVg
    giJoK03W8kd6mk3X+fvc4ARJo9RarYInsAvSHKlr1KpxeebuirfqgSz+uEW6pqOD1fV0oHnFncdf
    jV2k2CqmIfThB/ONQcn/4/EIlhdzXqxSlXAGz6C7ApHq6xUCdLILx/NfdUEypHIfyabrpXKOKOPx
    zekhGztEzB0gSJNebEa++EKxHDOc1Sc0YD9q9KvcaGSPTjlCJeaNhufg9Sz/iXZMP+d4Vkp+Bn6p
    mfUPG7tKharEoRChUNfRms8wVyNxmz6LRw5Uy14Dlodd0LyBQRB9Tx8FVYMh5AElwjbQOoDOIRvi
    IQIGsUNp/ZkP7PdBxc/b9o3rjUsZCzyCtP+jflZSqMenzXCsTI1Xay6On2wSVwQdJ1/2eIwKEfCF
    hj4DZlY5+jSo
  7. Default: None
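
Where the `base64` command is not available, the same encoding can be produced with Python's standard library. This is a minimal sketch: the function names are hypothetical, and stand-in bytes are used in place of a real g2.lic file. The second helper is a rough check that pasted text is well-formed Base64; it is not the template's own allowed pattern.

```python
import base64
import re

# Characters permitted in standard Base64 text, with optional "=" padding.
BASE64_TEXT = re.compile(r"[A-Za-z0-9+/]*={0,2}")


def license_as_base64(license_bytes):
    """Encode raw license bytes as a single-line Base64 string."""
    return base64.b64encode(license_bytes).decode("ascii")


def looks_like_base64(text):
    """Check that pasted text, with whitespace removed, is well-formed Base64."""
    compact = "".join(text.split())
    return len(compact) % 4 == 0 and BASE64_TEXT.fullmatch(compact) is not None


# Stand-in bytes; a real run would read the license file instead, for example
# open("/opt/senzing/etc/g2.lic", "rb").read()
fake_license = b"\x01\x00\x00\x00not a real license"
encoded = license_as_base64(fake_license)
print(encoded)
print(looks_like_base64(encoded))
```

Removing whitespace before checking allows for output that the `base64` command has wrapped across several lines, as in the example above.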

SenzingRecordMax

  1. Synopsis: When using SenzingInputUrl, this indicates the number of the last line that will be read from the file. It is used to limit the number of records ingested into Senzing.
  2. Required: Yes if using SenzingInputUrl, otherwise no.
  3. Type: Number
  4. Allowed pattern: Numbers. Specifically: [0-9]*
  5. Allowed values: 0 = Read entire file; Any positive integer.
  6. Example: 15000000
  7. Default: 0

SenzingRecordMin

  1. Synopsis: When using SenzingInputUrl, this indicates the number of the first line that will be read from the file. Used to skip lines at the beginning of the file. It is handy if the beginning of the file has already been ingested into Senzing.
  2. Required: Yes if using SenzingInputUrl, otherwise no.
  3. Type: Number
  4. Allowed pattern: Numbers. Specifically: [0-9]*
  5. Allowed values: 0 = Read from beginning; Any positive integer.
  6. Example: 100000
  7. Default: 0
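
The combined effect of SenzingRecordMin and SenzingRecordMax can be pictured as selecting a window of lines from the input file. The sketch below assumes line numbers are 1-based and that both bounds are inclusive; this document does not state either detail, so treat it as an illustration rather than the stream-producer's actual logic. The function name `select_lines` is hypothetical.

```python
from itertools import islice


def select_lines(lines, record_min=0, record_max=0):
    """Return an iterator over lines record_min..record_max
    (assumed 1-based and inclusive).

    A record_min of 0 means "start at the beginning";
    a record_max of 0 means "read to the end".
    """
    start = max(record_min - 1, 0)  # index of the first line to keep
    stop = record_max if record_max > 0 else None
    return islice(lines, start, stop)


# Ten illustrative JSON lines numbered 1 through 10.
sample = [f'{{"RECORD_ID": "{n}"}}' for n in range(1, 11)]
print(list(select_lines(sample, record_min=4, record_max=6)))
```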

SenzingVersion

  1. Synopsis: The version of Senzing installed onto the AWS Elastic File System. More information at Senzing API Version History.

Outputs

0penFirst

  1. Synopsis: An alias for UrlWebApp. Since it's one of the first things to look at, it is listed first.
  2. Details: It is listed first because the name "cheats" and uses a zero instead of a capital "o".

AccountID

  1. Synopsis: The AWS account ID used to create the AWS Cloudformation.

CertificateArn

  1. Synopsis: Amazon Resource Name (ARN) of certificate used for SSL support. More information at AWS LoadBalancer Console. Select a load balancer, view the "Listeners" tab, then click "View/edit certificates".

Host

  1. Synopsis: The hostname of the load balancer that is a proxy to all of the services.
  2. Details: More information at AWS Load Balancers console. Also used as the host value when using UrlSwagger.

QueueDeadLetter

  1. Synopsis: The queue to which records that cannot be ingested into the Senzing Engine are sent. In other words, records arrive here if the JSON message is malformed or if the Senzing Engine rejected the insert.
  2. Details: More information at AWS SQS Console.

QueueInput

  1. Synopsis: The queue from which records are ingested into the Senzing Engine. In other words, this is the queue where records are sent to be inserted into the Senzing Engine.
  2. Details: More information at AWS SQS Console.

QueueOutput

  1. Synopsis: The queue that is populated with responses from inserting records into the Senzing Engine. This is commonly called "WithInfo" information.
  2. Details: More information at AWS SQS Console.

QueueRedoerDeadLetter

  1. Synopsis: The queue to which records that the redoer cannot process in the Senzing Engine are sent.
  2. Details: More information at AWS SQS Console.

QueueRedoerInput

  1. Synopsis: The queue populated by the redoer with records the Senzing Engine identified as needing reevaluation. The queue will be consumed by the fleet of redoers that read from the queue and send to the Senzing Engine for reprocessing. The results will be sent to the QueueRedoerOutput.
  2. Details: More information at AWS SQS Console.

QueueRedoerOutput

  1. Synopsis: The queue that is populated with responses from reprocessing records. This is commonly called "WithInfo" information from the redoer.
  2. Details: More information at AWS SQS Console.

SshPassword

  1. Synopsis: Password to be used when logging into the SSHD container.

SshUsername

  1. Synopsis: User ID to be used when logging into the SSHD container.
  2. Details: Usually "root".

SubnetPublic1

  1. Synopsis: The first of two public subnets created.
  2. Details: See the subnet having a Name in the form {StackName}-ec2-subnet-public-1 in the AWS Virtual Private Cloud console.

SubnetPublic2

  1. Synopsis: The second of two public subnets created.
  2. Details: See the subnet having a Name in the form {StackName}-ec2-subnet-public-2 in the AWS Virtual Private Cloud console.

UrlApiServer

  1. Synopsis: A URL showing how to reach the Senzing API Server directly.

UrlApiServerHeartbeat

  1. Synopsis: A URL showing how to reach the Senzing API Server directly. The /heartbeat URI path simply demonstrates that the API server is responding. For more URIs, see the UrlSwagger output value.

UrlJupyter

  1. Synopsis: A URL showing how to reach the Senzing Jupyter notebooks.

UrlSwagger

  1. Synopsis: A URL showing how to reach the Swagger User Interface.
  2. Usage: To access the Senzing API server
    1. Using the URL, visit the UrlSwagger webpage.
    2. In Servers
      1. From the drop-down, select {protocol}://{host}:{port}{path}.
      2. protocol: https
      3. host: Enter the value of Host
      4. port: 443
      5. path: /api
    3. The HTTP URIs will now access the deployed Senzing API server.
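
The values entered in the Servers form combine into a single base URL. As a small sketch of that substitution, with a hypothetical Host value standing in for the stack's real output and a hypothetical helper name `expand_server_url`:

```python
def expand_server_url(template, **values):
    """Substitute {name} placeholders in a server URL template."""
    return template.format(**values)


template = "{protocol}://{host}:{port}{path}"
url = expand_server_url(
    template,
    protocol="https",
    host="example-123456.us-east-1.elb.amazonaws.com",  # hypothetical Host value
    port=443,
    path="/api",
)
print(url)
```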

UrlWebApp

  1. Synopsis: A URL showing how to reach the Senzing Entity Search Web App.

UrlXterm

  1. Synopsis: A URL showing how to reach the Senzing Xterm.
  2. Usage: From this Linux terminal, G2Command.py, G2Explorer.py, and G2ConfigTool.py can be run.

UserInitPassword

  1. Synopsis: The one-time password for the UserName.
  2. Details: When the one-time password is used, the user is prompted for a new password. Once a new password is submitted, the one-time password is no longer valid.

UserName

  1. Synopsis: The user name submitted for the CognitoAdminEmail. It is the initial user created to access the system.
  2. Details: To add users, see UserPool

UserPool

  1. Synopsis: The specific UserPool URL. It can be used to add, manage, or delete users for this Cloudformation.