The aws-cloudformation-ecs-senzing-stack-choices
AWS Cloudformation template
deploys user-specified components of the Senzing stack for use with a previously deployed
aws-cloudformation-database-cluster Cloudformation stack
that has been initialized by deploying the
aws-cloudformation-ecs-senzing-stack-basic Cloudformation stack.
The aws-cloudformation-ecs-senzing-stack-choices
demonstration is an AWS Cloudformation template that creates the following resources:
- AWS infrastructure
- Elastic IP address
- NAT Gateway
- Subnets
- Routes
- IAM Roles and Policies
- Logging
- AWS services deployed, if required
- AWS Cognito
- AWS Elastic Container Service (ECS) Fargate
- AWS Elastic File System (EFS)
- AWS Simple Queue Service (SQS)
- Optional Senzing services
- Senzing API server
- Senzing Entity Search Web App
- Senzing Redoer
- Senzing SSH access
- Senzing Stream-Loader
- Senzing Stream-producer
- Senzing Xterm
- SwaggerUI
The following diagram shows the relationship of the docker containers in this docker composition. Arrows represent data flow.
This docker formation brings up the following docker containers:
- senzing/entity-web-search-app
- senzing/jupyter
- senzing/redoer
- senzing/senzing-api-server
- senzing/sshd
- senzing/stream-loader
- senzing/stream-producer
- senzing/xterm
GitHub repository for aws-cloudformation-ecs-senzing-stack-choices.
- Preamble
- Expectations
- Demonstrate using AWS Console
- Using deployment
- Additional topics
- Parameters
- Outputs
At Senzing, we strive to create GitHub documentation in a "don't make me think" style. For the most part, instructions are copy and paste. Whenever thinking is needed, it's marked with a "thinking" icon 🤔. Whenever customization is needed, it's marked with a "pencil" icon ✏️. If the instructions are not clear, please let us know by opening a new Documentation issue describing where we can improve. Now on with the show...
- 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps there are some choices to be made. Perhaps it's an optional step.
- ✏️ - A "pencil" icon means that the instructions may need modification before performing.
- ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.
- Time: Budget 40 minutes to get the demonstration up-and-running.
- Background knowledge: This repository assumes a working knowledge of:
⚠️ Warning: This Cloudformation deployment will accrue AWS costs. With appropriate permissions, the AWS Cost Explorer can help evaluate costs.
- Visit AWS Cloudformation with Senzing template
- At lower-right, click on "Next" button.
- In Specify stack details
- In Parameters
- In Security responsibility
- Understand the nature of the security in the deployment.
- Once understood, enter "I AGREE".
- In Senzing installation
- Accept the End User License Agreement
- In Security
- Enter your email address. Example: [email protected]
- In Identify existing resources
- Enter the stack name of the previously deployed aws-cloudformation-database-cluster Cloudformation stack. Example: senzing-db
- In Optional: Initial data load
- If loading data during deployment is desired, choose "Yes" for "Optional: Would you like to have an initial set of data imported?"
- If "Yes" is chosen, the other fields specify what data is to be loaded.
- In Optional: Service
- Individual services can be selected.
- At lower-right, click "Next" button.
- In Configure stack options
- At lower-right, click "Next" button.
- In Review senzing-basic
- Near the bottom, in Capabilities
- Check "☑️ I acknowledge that AWS CloudFormation might create IAM resources."
- At lower-right, click "Create stack" button.
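The console choices above can also be captured as data. The following is a minimal Python sketch, not part of the template itself, that converts a dictionary of choices into the list-of-dictionaries shape accepted by the CloudFormation CreateStack API. Only parameter names mentioned in this document are used; the helper name `to_cfn_parameters` and the example values are assumptions for illustration, and a real deployment also needs the EULA, security responsibility, email, and database stack name parameters.

```python
def to_cfn_parameters(choices):
    """Convert a plain dict into CloudFormation's Parameters list shape.

    Booleans are rendered as the "Yes"/"No" strings this template uses.
    """
    parameters = []
    for key, value in sorted(choices.items()):
        if isinstance(value, bool):
            value = "Yes" if value else "No"
        parameters.append({"ParameterKey": key, "ParameterValue": str(value)})
    return parameters


# Example values only; adjust to your own environment.
choices = {
    "CidrInbound": "45.26.129.200/32",
    "RunStreamProducer": True,
    "SenzingInputUrl": "https://www.example.com/my/records.json",
    "SenzingRecordMin": 0,
    "SenzingRecordMax": 100000,
}
stack_parameters = to_cfn_parameters(choices)
for parameter in stack_parameters:
    print(parameter)
```

Keeping the choices in one dictionary makes it easier to review what will be deployed before any AWS costs are incurred.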
- Visit AWS CloudFormation console.
- Make sure correct AWS region is selected.
- Wait until "senzing-basic" status is CREATE_COMPLETE.
- Senzing formation takes about 20 minutes to fully deploy.
- May have to hit the refresh button a few times to get updated information.
- Click on "senzing-basic" stack.
- Click on "Outputs" tab.
- Open the "0penFirst" value in a new web browser tab or window.
- Because this uses a self-signed certificate, a warning will come up in your browser. Simply continue.
- In the "Sign in with your email and password" dialog box, enter the UserName and UserInitPassword values seen in the "Outputs" tab of the "senzing-basic" stack. This is a one-time password.
- In Change Password, enter a new password.
- How to load AWS Cloudformation queue
- How to migrate Senzing in AWS Cloudformation
- How to update Senzing license
The AWS resources created by the cloudformation.yaml template can be seen in the AWS Management Console.
- CloudFormation
- CloudWatch
- Cognito
- Elastic Compute Cloud (EC2)
- Elastic Container Service (ECS)
- Elastic File System (EFS)
- Identity and Access Management (IAM)
- Lambda
- Relational Database Service (RDS)
- Route53
- Simple Queue Service (SQS)
- Systems Manager (SSM)
- Virtual Private Cloud (VPC)
- Visit AWS Cloudformation console.
- Choose appropriate "Stack name"
- Choose "Outputs" tab.
Technical information on AWS Cloudformation parameters can be seen at Parameters.
- Synopsis: To use the Senzing code, you must agree to the End User License Agreement (EULA). This step is intentionally tricky to ensure that you make a conscious effort to accept the EULA.
- Required: Yes
- Type: String
- Allowed values: See SENZING_ACCEPT_EULA.
- Default: None
- Synopsis: A Classless Inter-Domain Routing (CIDR) value used to limit access to the system. This restricts the inbound traffic to requests from specified IP ranges. Examples:
  - A system with the value 0.0.0.0/0 allows access from anywhere.
  - A system with the value 45.26.129.0/24 will allow access from IP addresses in the range 45.26.129.0 to 45.26.129.255.
  - A system with the value 45.26.129.200/32 will allow access from a single IP address, 45.26.129.200.
- Required: Yes
- Type: String
- Allowed pattern: An IPv4 address with an optional prefix length. Specifically: '(?:\d{1,3}\.){3}\d{1,3}(?:/\d\d?)?'
- Allowed values: String in IPv4 CIDR format.
- Example: 45.26.129.200/32
- Default: 0.0.0.0/0
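To see how a CIDR value limits inbound addresses, here is a small Python sketch using the standard `ipaddress` module together with the allowed pattern listed above. The function name `is_allowed` is hypothetical; the actual filtering is performed by AWS networking, not by this code.

```python
import ipaddress
import re

# Same pattern as the parameter's allowed pattern.
CIDR_PATTERN = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}(?:/\d\d?)?")


def is_allowed(client_ip, cidr_inbound):
    """Return True if client_ip falls inside the cidr_inbound range."""
    if not CIDR_PATTERN.fullmatch(cidr_inbound):
        raise ValueError(f"not in IPv4 CIDR format: {cidr_inbound!r}")
    # strict=False tolerates host bits being set, e.g. 45.26.129.7/24.
    network = ipaddress.ip_network(cidr_inbound, strict=False)
    return ipaddress.ip_address(client_ip) in network


print(is_allowed("8.8.8.8", "0.0.0.0/0"))
print(is_allowed("45.26.129.255", "45.26.129.0/24"))
print(is_allowed("45.26.130.1", "45.26.129.0/24"))
print(is_allowed("45.26.129.200", "45.26.129.200/32"))
```

Note that the pattern only checks the general shape; `ipaddress` additionally rejects out-of-range values such as an octet of 999.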
- Synopsis: An email address of the person administrating this Cloudformation. The email address will be used when email is sent to additional users via the AWS Cognito web console.
- Required: Yes
- Type: String
- Allowed values:
- A string in email format.
- Example:
[email protected]
- Synopsis: The stack name of the "Senzing aws-cloudformation-database-cluster" deployment. See aws-cloudformation-database-cluster.
- Required: Yes
- Type: String
- Synopsis: Optionally, run the Senzing API server to create a RESTful API service to the Senzing Engine.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: Yes
- Synopsis: Optionally, run the Senzing Jupyter notebooks to view Jupyter notebooks showing Senzing code samples.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: No
- Synopsis: Optionally, run the redoer to process "redo records".
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: Yes
- Synopsis: Optionally, run the sshd container that allows ssh and scp access. Can be used for debugging, copying files to the EFS, or the Senzing Exploratory Tools. This is an economical container. To run a "maxed-out" container, see RunSshdMax.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: Yes
- Synopsis: Optionally, run the sshd container that allows ssh and scp access. Can be used for debugging, copying files to the EFS, or the Senzing Exploratory Tools. This differs from RunSshd in that it has the maximum resources of 30 GB memory and 4 vCPU.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: Yes
- Synopsis: Optionally, run the stream-loader, which reads records from the SQS queue and sends them to the Senzing Engine.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: Yes
- Synopsis: Optionally, run the stream-producer container that fetches JSON lines from a file and pushes them to the SQS queue. If "Yes" is chosen, SenzingInputUrl, SenzingRecordMin, and SenzingRecordMax need to be specified.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: Yes
- Synopsis: Optionally, run the swaggerapi/swagger-ui container that hosts the SwaggerUI for viewing the Senzing REST API OpenAPI document.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: Yes
- Synopsis: Optionally, capture information about the IP traffic going to and from network interfaces in your VPC.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Default: No
- References:
- Synopsis: Optionally, run the entity-search-web-app which gives a web-based representation of data stored in the Senzing data model.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Example:
- Default: Yes
- Synopsis: Optionally, run the Senzing Xterm, which gives a web-based terminal useful in running command line programs.
- Required: Yes
- Type: Boolean
- Allowed values: [ "Yes" | "No" ]
- Example:
- Default: Yes
- Synopsis: The Senzing proof-of-concept AWS Cloudformation uses AWS Cognito for authentication, and HTTPS (using a self-signed certificate) for encrypted network traffic to expose services through a single, internet-facing AWS Elastic Load Balancer. With the exception of the senzing/sshd container, no tasks in the AWS Elastic Container Service (ECS) have public IP addresses.
  To enable additional security measures for the deployment in your specific environment, you'll need to consult with your AWS administrator. Examples of additional security measures:
  - AWS Route53 with genuine X.509 certificate
  - AWS Web Application Firewall (WAF)
  - AWS Shield
  - AWS Firewall Manager
  - Amazon API Gateway
  - Restrictive value for CidrInbound
- Required: Yes
- Type: String
- Allowed values:
  - "I AGREE"
- Default: None
- Synopsis: If using RunStreamProducer, supply the DATA_SOURCE value to be used.
- Required: Yes
- Type: String
- Default: TEST
- Synopsis: If using RunStreamProducer, supply the ENTITY_TYPE value to be used.
- Required: Yes
- Type: String
- Default: GENERIC
- Synopsis: If using RunStreamProducer, supply the URL of a tar-gzipped file in JSON-lines format containing records to ingest into Senzing.
- Required: Yes if running Stream Producer, otherwise no.
- Type: String
- Allowed pattern: A URL starting with http:// or https://.
- Example: https://www.example.com/my/records.json
- Default: https://s3.amazonaws.com/public-read-access/TestDataSets/SenzingTruthSet/truth-set.json
- Synopsis: To ingest more than 100,000 records, a Senzing license is required. A binary version of the Senzing license, g2.lic, is not usable as a parameter in the text entry field. Instead, a Base64 representation of the information is needed. An example of how to produce Base64 from g2.lic on Linux and macOS: base64 /opt/senzing/etc/g2.lic
  Copy the entire output from the command and paste it into the text entry field.
- Required: Yes if ingesting more than 100,000 records, otherwise no.
- Type: String
- Allowed pattern: Empty or Base64 characters. Specifically: ^$|[^-A-Za-z0-9+/=]|=[^=]|={3,}$
- Allowed values: Base64 encoded string
- Example:
AQAAADgCAAAAAAAAU2VuemluZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARGVtbyBFeHBpcmVkAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADIwMjAtMTItMTYA AAAAAAAAAAAARVZBTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAFNUQU5EQVJEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAKCGAQAAAAAAMTk3Ni0wMS0wMQAAAAAAAAAAAABN T05USExZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAARkdIST5XYOZ90kbyAbU7wM7XvPCwq/FgORZIekwFMg8zi3tCD0V5+12q72aqk0E6JOct +cPAq/T50N5Pf5nvJZ6TaW3TzQbnH/z5f/ALsWLydE2DPNvq3HuAjkjZpg2h7mb4OUqorGxDI9RX TX8hPjzYrBfMdOgl1DlRBVG36WwdpB8AnSfaegbYU+U/vfof+ff6mJk8gzPg+OGPwg21/S6i2TT4 RbTCSYP/TpfXyJGE6dbQWEC9rFhYuWq3mFF3z7zFEcmxpNfZuBtYsxni8P3sDZ706RA+wcQF7TVg giJoK03W8kd6mk3X+fvc4ARJo9RarYInsAvSHKlr1KpxeebuirfqgSz+uEW6pqOD1fV0oHnFncdf jV2k2CqmIfThB/ONQcn/4/EIlhdzXqxSlXAGz6C7ApHq6xUCdLILx/NfdUEypHIfyabrpXKOKOPx zekhGztEzB0gSJNebEa++EKxHDOc1Sc0YD9q9KvcaGSPTjlCJeaNhufg9Sz/iXZMP+d4Vkp+Bn6p mfUPG7tKharEoRChUNfRms8wVyNxmz6LRw5Uy14Dlodd0LyBQRB9Tx8FVYMh5AElwjbQOoDOIRvi IQIGsUNp/ZkP7PdBxc/b9o3rjUsZCzyCtP+jflZSqMenzXCsTI1Xay6On2wSVwQdJ1/2eIwKEfCF hj4DZlY5+jSo
- Default: None
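As an alternative to the base64 command, the same encoding can be produced and checked in Python. This is a sketch under assumptions: the helper names are hypothetical, the sample bytes stand in for the real contents of g2.lic, and the validity check uses Python's own Base64 decoder rather than the template's allowed pattern.

```python
import base64
import binascii


def encode_license(license_bytes):
    """Return the Base64 text for the binary license contents."""
    return base64.b64encode(license_bytes).decode("ascii")


def is_valid_license_text(text):
    """Accept an empty value or well-formed Base64.

    Whitespace is removed first because command-line tools may wrap
    their output across several lines.
    """
    compact = "".join(text.split())
    if compact == "":
        return True
    try:
        base64.b64decode(compact, validate=True)
    except binascii.Error:
        return False
    return True


# Stand-in bytes; in practice read them with open(path, "rb").read().
sample = b"\x01\x00\x00\x00Senzing"
encoded = encode_license(sample)
print(encoded)
print(is_valid_license_text(encoded))
print(is_valid_license_text("not base64!"))
```

Checking the text before pasting it can catch a partially copied value, which would otherwise only fail after deployment starts.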
- Synopsis: When using SenzingInputUrl, this indicates the number of the last line that will be read from the file. It is used to limit the number of records ingested into Senzing.
- Required: Yes if using SenzingInputUrl, otherwise no.
- Type: Number
- Allowed pattern: Numbers. Specifically: [0-9]*
- Allowed values: 0 = Read entire file; Any positive integer.
- Example: 15000000
- Default: 0
- Synopsis: When using SenzingInputUrl, this indicates the number of the first line that will be read from the file. Used to skip lines at the beginning of the file. It is handy if the beginning of the file has already been ingested into Senzing.
- Required: Yes if using SenzingInputUrl, otherwise no.
- Type: Number
- Allowed pattern: Numbers. Specifically: [0-9]*
- Allowed values: 0 = Read from beginning; Any positive integer.
- Example: 100000
- Default: 0
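The interaction of SenzingRecordMin and SenzingRecordMax can be illustrated with a short Python sketch. It assumes line numbers are 1-based and that both bounds are inclusive, which this document does not state explicitly; the function `select_records` is hypothetical and is not the stream-producer's actual code.

```python
from itertools import islice


def select_records(lines, record_min=0, record_max=0):
    """Return the lines between record_min and record_max.

    Assumptions: lines are numbered from 1, both bounds are inclusive,
    and 0 means "no limit" for either bound.
    """
    start = max(record_min - 1, 0)                 # 1-based to 0-based index
    stop = record_max if record_max > 0 else None  # None reads to the end
    return list(islice(lines, start, stop))


records = ["r1", "r2", "r3", "r4", "r5", "r6"]
print(select_records(records))        # whole file
print(select_records(records, 3, 5))  # lines 3 through 5
print(select_records(records, 5, 0))  # line 5 to the end
```

Under these assumptions, a second deployment can resume where a first one stopped by setting SenzingRecordMin to one more than the earlier SenzingRecordMax.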
- Synopsis: The version of Senzing installed onto the AWS Elastic File System. More information at Senzing API Version History.
- Synopsis: An alias for UrlWebApp. Since it's one of the first things to look at, it is listed first.
- Details: It is listed first because the name "cheats" and uses a zero instead of a capital "o".
- Synopsis: The AWS account ID used to create the AWS Cloudformation.
- Synopsis: Amazon Resource Name (ARN) of certificate used for SSL support. More information at AWS LoadBalancer Console. Select a load balancer, view the "Listeners" tab, then click "View/edit certificates".
- Synopsis: The hostname of the load balancer that is a proxy to all of the services.
- Details: More information at AWS Load Balancers console. Also used as the host value when using UrlSwagger.
- Synopsis: The queue to which records that are not able to be ingested into Senzing Engine are sent. In other words, records arrive here if the JSON message is malformed or if the Senzing Engine rejected the insert.
- Details: More information at AWS SQS Console.
- Synopsis: The queue from which records are ingested into Senzing Engine. In other words, this is the queue where records are sent to be inserted into the Senzing Engine.
- Details: More information at AWS SQS Console.
- Synopsis: The queue that is populated with responses from inserting records into the Senzing Engine. This is commonly called "WithInfo" information.
- Details: More information at AWS SQS Console.
- Synopsis: The queue to which records that the redoer is not able to ingest into the Senzing Engine are sent.
- Details: More information at AWS SQS Console.
- Synopsis: The queue populated by the redoer with records the Senzing Engine identified as needing reevaluation. The queue will be consumed by the fleet of redoers that read from the queue and send to the Senzing Engine for reprocessing. The results will be sent to the QueueRedoerOutput.
- Details: More information at AWS SQS Console.
- Synopsis: The queue that is populated with responses from reprocessing records. This is commonly called "WithInfo" information from the redoer.
- Details: More information at AWS SQS Console.
- Synopsis: Password to be used when logging into the SSHD container.
- Synopsis: User ID to be used when logging into the SSHD container.
- Details: Usually "root".
- Synopsis: The first of two public subnets created.
- Details: See the subnet having a Name in the form {StackName}-ec2-subnet-public-1 in the AWS Virtual Private Cloud console.
- Synopsis: The second of two public subnets created.
- Details: See the subnet having a Name in the form {StackName}-ec2-subnet-public-2 in the AWS Virtual Private Cloud console.
- Synopsis: A URL showing how to reach the Senzing API Server directly.
- Synopsis: A URL showing how to reach the Senzing API Server directly. The /heartbeat URI path simply demonstrates that the API server is responding. For more URIs, see the UrlSwagger output value.
- Synopsis: A URL showing how to reach the Senzing Jupyter notebooks.
- Synopsis: A URL showing how to reach the Swagger User Interface.
- Usage: To access the Senzing API server
  - Using the URL, visit the UrlSwagger webpage.
  - In Servers
    - From the drop-down, select {protocol}://{host}:{port}{path}.
    - protocol: https
    - host: Enter the value of Host
    - port: 443
    - path: /api
  - The HTTP URIs will now access the deployed Senzing API server.
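The Servers settings used in SwaggerUI can be assembled the same way outside the browser. The sketch below fills in the {protocol}://{host}:{port}{path} template in Python. The host name shown is a made-up placeholder for the Host output value, and placing /heartbeat under the /api path is an assumption based on the settings in this document.

```python
def api_base_url(host, protocol="https", port=443, path="/api"):
    """Fill in the {protocol}://{host}:{port}{path} server template."""
    return f"{protocol}://{host}:{port}{path}"


def heartbeat_url(host):
    """Assumed location of the heartbeat endpoint under the API path."""
    return api_base_url(host) + "/heartbeat"


# Placeholder host; use the Host value from the stack's Outputs tab.
host = "my-load-balancer.example.com"
print(api_base_url(host))
print(heartbeat_url(host))
```

Because the deployment uses a self-signed certificate, an HTTP client calling these URLs will typically need to be told to accept that certificate.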
- Synopsis: A URL showing how to reach the Senzing Entity Search Web App.
- Synopsis: A URL showing how to reach the Senzing Xterm.
- Usage: From this Linux terminal, G2Command.py, G2Explorer.py, and G2ConfigTool.py can be run.
- Synopsis: The one-time password for the UserName.
- Details: When the one-time password is used, the user is prompted for a new password. Once a new password is submitted, the one-time password has no value.
- Synopsis: The user name submitted for the CognitoAdminEmail. It is the initial user created to access the system.
- Details: To add users, see UserPool
- Synopsis: The specific UserPool URL. It can be used to add, manage, or delete users for this Cloudformation.