DevCenter Admin

This is a Node.js Web App to be used to manage the database of articles that feed the CDS "How-tos" section, Bluemix contextual search amongst others. It provides the following functionality:

a NoSQL database of article URLS and their meta data
a search API to allow third party systems to search the database
a Content Management System web UI to allow administrators to
- add/edit/delete content
- edit the schema
an API that allows a third party system to suggest URLs to be added to the database
an API used to allow Slack 'slash command' to be connect to suggest URLs to be added to the database

Configuration

This a Bluemix app and expects an attached Cloudant NoSQL database service to be attached called "devcenter-cloudant".

You can run this app locally by setting appropriate environment variables to simulate a Bluemix environment e.g

export VCAP_SERVICES='{"cloudantNoSQLDB":[{"name":"devcenter-cloudant","label":"cloudantNoSQLDB","plan":"Shared","credentials":{"username":"myusername","password":"mypassword","host":"myhost.cloudant.com","port":443,"url":"https://myusername:[email protected]"}}]}'
export PASSWORD=1f2dd4437feca264b4ba93c2b4e71667bcd32e2d
export SLACK_TOKEN=xPxB8e1031GOh2oJLoofvoEi
export API_KEYS=7c9fdffca0d003646a9f1fdf591f29a9
export VCAP_APP_HOST=http://localhost:6007/
node app.js

Data

In it's simplest form, an item in the database looks like this:

{  
{
  "_id": "069f0fa6b0247d5bb843f32d6d44c608b925d798",
  "_rev": "1-3ea7e27db22ae8ba7e1989c82359ff7d",
  "name": "Document title",
  "full_name": "A longer document title",
  "description": "this is a longer form description of the document",
  "body": "this is the document body",
  "url": "https://full.url.com/",
  "created_at": "2015-09-11 09:45:34 +00:00",
  "updated_at": "2015-09-11 09:45:34 +00:00",
  "imageurl": "https://developer.ibm.com/bluemix/wp-content/uploads/sites/20/2015/09/loopback-hero.png",
  "status": "Provisional",
}

The documents are formatted in JSON with the _id and _rev are the unique identifier and revision token from the Cloudant database. The other fields are self explanatory; some of them are automatically populated when a document is created.

The _id is autoamatically calculated when a document is created and is a hash of the URL. This ensures that two documents with the same URL cannot be stored in the database. Because of this, a documents URL should not be changed after its creation.

It is important to ensure that the created_at and updated_at fields are in "YYYY-MM-DD" format, so that sort ordering will be in the correct order.

The status field can take one of three values:

Provisional - the status of a document prior to publication
Live - a published document
Deleted - a document that is no longer required.

Only 'Live' documents are indexed for searching.

Schema

As well as the core fields of a document, any number of custom fields can be added. The configuration of the custom fields is stored in a special document whose _id is 'schema'.

Each key of the schema object (apart from _id and _rev) represents a field that can be used in our documents. The following fields are used to configure the schema:

field	Description	e.g.
type	the data type	"string", "boolean", "number", "arrayofstrings"
enforceValues	whether values are enforced	true, false
values	array of possible values	["red", "green", "blue"]
faceted	whether to facet or not	true, false

Simple string/boolean/number values, can be simply expressed with:

  "otherurl": {
    "type": "string"
  },

If a field's value is to picked from set of fixed values, then the list of values can be recorded in the schema. The 'level' field has such a definition:

"level": {
    "type": "string",
    "enforceValues": true,
    "values": [
      "Beginner",
      "Intermediate",
      "Advanced"
    ],
    "faceted": true
  }

It is a a "string" whose values are enforced and so can only take one of "Beginner", "Intermediate" or "Advanced".

The 'arrayofstrings' type is useful for creating fields that can take multiple values. The 'namespace' field has the following configuration:

  "namespace": {
    "type": "arrayofstrings",
    "enforceValues": true,
    "values": [
      "Bluemix",
      "Cloud Data Services",
      "Mobile First"
    ],
    "faceted": true
  }

This states that 'namespace' is an array of strings whose values can only be one or more of "Bluemix", "Cloud Data Services" and "Mobile First".

A typical full schema document looks like this:

{
  "_id": "schema",
  "_rev": "2-f9b34956c87b341054ca7e0f47d2ff4c",
  "languages": {
    "type": "arrayofstrings",
    "enforceValues": true,
    "values": [
      "Ruby",
      "Python",
      "Java",
      "JavaScript",
      "PHP",
      "Objective-C",
      "C#",
      "Swift",
      "C"
    ],
    "faceted": true
  },
  "technologies": {
    "type": "arrayofstrings",
    "enforceValues": true,
    "values": [
      "Bluemix",
      "Cloud Foundry",
      "Virtual machines",
      "Containers",
      "Cloudant",
      "Cloudant Local",
      "Cordova",
      "CouchDB",
      "DataWorks",
      "Elasticsearch",
      "Graph Data Store",
      "Ionic",
      "Looker",
      "MobileFirst",
      "MongoDB",
      "OAuth",
      "PhoneGap",
      "PostgreSQL",
      "PouchDB",
      "Redis",
      "Salesforce",
      "Spark",
      "WebSockets",
      "dashDB"
    ],
    "faceted": true
  },
  "topic": {
    "type": "string",
    "enforceValues": true,
    "values": [
      "Article",
      "Tutorial",
      "Video",
      "Sample",
      "API",
      "Blog",
      "Forum"
    ],
    "faceted": true
  },
  "featured": {
    "type": "boolean",
    "faceted": true
  },
  "demourl": {
    "type": "string"
  },
  "githuburl": {
    "type": "string"
  },
  "videourl": {
    "type": "string"
  },
  "documentationurl": {
    "type": "string"
  },
  "author": {
    "type": "string"
  },
  "otherurl": {
    "type": "string"
  },
  "level": {
    "type": "string",
    "enforceValues": true,
    "values": [
      "Beginner",
      "Intermediate",
      "Advanced"
    ],
    "faceted": true
  },
  "type": {
    "type": "string",
    "enforceValues": true,
    "values": [
      "Article",
      "Tutorial",
      "Video"
    ]
  },
  "namespace": {
    "type": "arrayofstrings",
    "enforceValues": true,
    "values": [
      "Bluemix",
      "Cloud Data Services",
      "Mobile First"
    ],
    "faceted": true
  }
}

Faceting

All fields in the schema are indexed for search, but only fields marked faceted: true are faceted. A faceted field is one whose values are counted in a result set to give the user insight into the make up of the result set. In the search results, facet counts appear like this:

"counts": {
        "namespace": {
            "Bluemix": 5,
            "Cloud Data Services": 2,
            "Mobile First": 2
        },
        "level": {
            "Advanced": 1,
            "Intermediate": 1,
            "Beginner": 8
        }
}

Faceting is useful for fields that have repeating values within the data set (e.g. brand, colour, category, country etc). Faceting is entirely unsuitable for for fields which are unique within the data set (e.g. ids, description, url).

A good rule of thumb is that faceting should be enabled for fields whose values are enforced and come for a fixed set of possibilities.

API

Third-party systems can add articles to the database by POSTING to the /api/submit endpoint passing

url - the URL of the article to submit
token - the api key used to identify valid API calls
namespace - the namespace used to tag the document

where the token must be one of the keys listed in the API_KEYS environment variable.

e.g.

curl -X POST -d 'url=http%3A%2F%2Fmyblog.com%2Fpost%2F1&token=abc123&namespace=Cloud+Data+Services' 'https://mydevcenter.mybluemix.net/api/submit'

A successful API call will return a JSON message such as:

{
    "ok": true,
    "msg": "Thanks for submitting http://myblog.com/post/1. The URL will be published after it is reviewed by a human.",
    "id": "d0cfae70c5b7f00b0e858ee6c7b99be73d6bd60d"
}

If a URL is already in the database, the response will look like:

{
    "ok": false,
    "msg": "There was an error :( Error: Document update conflict."
}

It is pretty easy to hook up to the API using a tool like Node-RED. First we create a "sub flow" that expects an input from a Feed Parse node.

It will:

converts the Feed Parse output into the form required to HTTP POST
HTTP POST the data to the /api/submit endpoint

The function's source code is as follows:

var link = msg.article.link;
var namespace = msg.namespace || "";
var msg = {};
msg.headers={"Content-type": "application/x-www-form-urlencoded"};
msg.payload= "token=03646a9f1fdf7c9fdffca0d0591f29a9";
msg.payload+= "&url=" + encodeURIComponent(link);
msg.payload+= "&namespace=" + encodeURIComponent(namespace);
return msg;

We can the hook up several feeds to this sub-flow:

We hook up each feed to the sub flow, passin the data through functions which add a custom namespace parameter:

msg.namespace = "Cloud Data Services";
return msg;

Slack API

The Slack API (POST /slack) is almost identical to the /api/submit call but the URL is passed in the text parameter and the token field is validated against the SLACK_TOKEN environment variable.

e.g.

curl -X POST -d 'text=http%3A%2F%2Fmyblog.com%2Fpost%2F1&token=abc123' 'https://mydevcenter.mybluemix.net/slack'

This allows a Slack "slash command" to be setup to point to this app's '/slack' endpoint to allow your slack users to suggest content by typing

/customslackcommand http://myblog.com/post/1

in their Slack app or web interface, where customslackcommand is string defined in your Slack account.

Web UI

This app presents a web app that provides the following features:

new articles to be added by completing a form
provisional articles to be suggested by supplying just the URL of the article
editing of any article
editing of the schema

N.B. Only articles who have a status field with value 'Live' are indexed for search. The only way to edit an article's status is using the Web UI. This is deliberate to ensure that there is human oversight before content is published.

Search API

Under the hood, the articles data is stored in an IBM Cloudant database. Cloudant is a NoSQL database offerring an HTTP API and can be queried using MapReduce and using Lucene indexes. When the app is installed, a database called 'devcenter' is created and indexes are created to allow

articles to be queried by status for the web UI menu page, using a MapReduce view
articles to be queried using the Lucene query language using Cloudant Search

Both indexes are created automatically; the Lucene index is updated every time the schema is changed, to ensure that all fields are correctly indexed.

To query the search index, locate the domain name of the Cloudant service that is associated with your Bluemix app. This can be found in the Bluemix dashboard and will look something like this:

https://9582d32-634c-4ade-83fe-966ea8a8d8c7-bluemix.cloudant.com

The URL of the search API is calculated by adding /devcenter/_design/search/_search/search to the end of the Cloudant URL.

The following fields are then used for querying:

field	Description	e.g.
q	the search query	`:`
include_docs	whether to return the document body	`true`
counts	list of faceted fields to count	`["level", "namespace"]`
limit	the number of results to return	`10`
sort	override the sort order	`"-date"`

Some hints:

*:* means 'match everything'
fielded queries follow this pattern level:beginner
logic can be performed too level:beginner AND namespace:Bluemix
free-text search q=pouchdb replication javascript
combine free-text and fielded search q=pouchdb+replication+javascript+AND+namespace:Bluemix
you may only do counts on fields that were marked as 'facet:true' in the schema
you probably want include_docs=true

Full Cloudant Search documentation is [https://docs.cloudant.com/search.html][here].

Sorting

By default, the sort order is 'best match' first. This can be overriden but Cloudant search limits sorting to numeric fields, or string fields that are not 'analyzed'. To this end a date field is automatically indexed for your convenience that can be used to sort the result set in date order (sort="date") or reverse date order (sort="-date").

Sorting of boolean fields can be acheived by doing sort="featured<string>" (false first) or sort="featured<string>" (true first).

Environment variables

This app can be configured within Bluemix using the following set of environment variables, as well as VCAP_SERVICES which specifies the attached Bluemix services.

field	Description	e.g.
PASSWORD	the password used to login to the web UI	mypassword
SLACK_TOKEN	list of valid Slack tokens	abc123,def456
API_KEYS	list of valid API keys	abc123,def456
CLOUDANT_HOST	the URL of the Cloudant service	https://U:[email protected]

Name		Name	Last commit message	Last commit date
Latest commit History 73 Commits
ddocs		ddocs
lib		lib
public		public
test		test
views		views
.cfignore		.cfignore
.gitignore		.gitignore
README.md		README.md
app.js		app.js
manifest.yml		manifest.yml
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DevCenter Admin

Configuration

Data

Schema

Faceting

API

Slack API

Web UI

Search API

Sorting

Environment variables

About

Releases

Packages

Languages

ibm-watson-data-lab/devcenter

Folders and files

Latest commit

History

Repository files navigation

DevCenter Admin

Configuration

Data

Schema

Faceting

API

Slack API

Web UI

Search API

Sorting

Environment variables

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages