Humanize contractions

This package attempts the impossible: reviving English contractions after they have been destroyed by dehumanizing processes such as camel-casing, kebab-casing, lower-casing, or snake-casing. It’s not an exact science and you can’t ever get 100% results, but maybe 80% results... on a good day.

How it works

It’s basically a simple find-and-replace, no dark arts of any sort. It works on an array of objects like this:

const contractions = [
  { word: 'I’m', simplified: 'im', isReplaceable: true },
  { word: 'she’ll', simplified: 'shell', isReplaceable: false },
  // ...
]

Some words are replaceable and some are not, because the simplified form is also a word in its own right, like "shell". That’s the crux of this operation: it’s imperfect at its core. Still, it helps a lot when reverse engineering destroyed strings.
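
The replace step described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the helper name replaceContractions and the two-entry sampleContractions list are assumptions made for the example, and the real list is much longer.

```javascript
// Illustrative stand-in for the package's list (assumption: only the two
// entries shown in this README are included here).
const sampleContractions = [
  { word: 'I’m', simplified: 'im', isReplaceable: true },
  { word: 'she’ll', simplified: 'shell', isReplaceable: false }
]

// Hypothetical helper, not the package's real function.
function replaceContractions(phrase, list = sampleContractions) {
  // Index the entries by their simplified form for quick lookups.
  const bySimplified = new Map(list.map((entry) => [entry.simplified, entry]))

  return phrase
    .split(' ')
    .map((token) => {
      const entry = bySimplified.get(token.toLowerCase())
      // Swap in the contraction only when the entry is marked as safe to replace.
      return entry && entry.isReplaceable ? entry.word : token
    })
    .join(' ')
}

console.log(replaceContractions('im sure the shell is empty'))
// I’m sure the shell is empty
```

Here "im" is replaced because it is flagged as replaceable, while "shell" is left alone because it is a word in its own right.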

Examples

Simple example:

const humanized = humanizeContractions('im having a bad day')
// I’m having a bad day

It doesn’t replace words that have a meaning of their own:

const humanized = humanizeContractions('shed be mad')
// shed be mad

Unless you know your data-set and enable the brutalMode:

const humanized = humanizeContractions('shed be mad', { brutalMode: true })
// she’d be mad

If the brutalMode is too brutal, individual words can be included. Maybe the contraction "she’d" appears a lot in a given data-set and the word "shed" never:

const humanized = humanizeContractions('shed be mad', { include: ['shed'] })
// she’d be mad

If you have a special data-set in which a replaceable word has a meaning of its own, you can exclude that word:

const humanized = humanizeContractions('the race at im was great', {
  // "IM" is referring to Isle of Man in this case.
  exclude: ['im']
})
// the race at im was great

Or you can tweak it more by combining includes and excludes to suit your data:

const humanized = humanizeContractions('the race at im was great id say', {
  exclude: ['im'],
  include: ['id']
})
// the race at im was great I’d say

Words can be excluded when in brutalMode:

const humanized = humanizeContractions('shell be doing some work in the shed', {
  brutalMode: true,
  exclude: ['shed']
})
// she’ll be doing some work in the shed
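
The way the three options interact in the examples above can be summarized as a small decision function. This is only a sketch: shouldReplace is a hypothetical name, the 'she’d' entry is inferred from the example outputs, and the precedence (exclude first, then include, then brutalMode) is an assumption based on those examples. The README does not say what happens when a word is listed in both include and exclude; this sketch lets exclude win.

```javascript
// Hypothetical decision helper; the option names come from this README,
// the precedence rules are inferred from its examples.
function shouldReplace(entry, { brutalMode = false, include = [], exclude = [] } = {}) {
  // An explicit exclude always wins, even in brutalMode.
  if (exclude.includes(entry.simplified)) return false
  // An explicit include overrides isReplaceable: false.
  if (include.includes(entry.simplified)) return true
  // Otherwise brutalMode replaces everything; the default follows the hint.
  return brutalMode || entry.isReplaceable
}

// Entries assumed for illustration.
const shed = { word: 'she’d', simplified: 'shed', isReplaceable: false }
const im = { word: 'I’m', simplified: 'im', isReplaceable: true }

console.log(shouldReplace(shed)) // false
console.log(shouldReplace(shed, { brutalMode: true })) // true
console.log(shouldReplace(shed, { include: ['shed'] })) // true
console.log(shouldReplace(im, { exclude: ['im'] })) // false
console.log(shouldReplace(shed, { brutalMode: true, exclude: ['shed'] })) // false
```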

Access the list of contractions

The module exports it:

import { contractions } from 'humanize-contractions'
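
One possible use of the list is to find the ambiguous words in it before deciding what to put into include or exclude. The snippet below is a sketch that uses a small stand-in array (an assumption, so that it runs on its own); with the real package, the imported contractions array would take its place.

```javascript
// Stand-in for the exported list, limited to the entries shown in this README.
const contractions = [
  { word: 'I’m', simplified: 'im', isReplaceable: true },
  { word: 'she’ll', simplified: 'shell', isReplaceable: false }
]

// Collect the simplified forms that are skipped unless included or in brutalMode.
const ambiguous = contractions
  .filter((entry) => !entry.isReplaceable)
  .map((entry) => entry.simplified)

console.log(ambiguous)
// [ 'shell' ]
```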

About humanizing the string

This package expects phrases whose words are separated by spaces, so humanize the string before you feed it in if it happens to be, for example, kebab-cased:

import humanize from 'humanize-string'

const input = humanize('im-having-a-bad-day')

const stringWithContractions = humanizeContractions(input)
// I’m having a bad day
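
If pulling in a dependency is not an option, the pre-processing step can be approximated with a couple of regular expressions. This is a rough sketch under stated assumptions: toSpacedWords is a hypothetical helper, it only handles camelCase, kebab-case, and snake_case, and it lower-cases everything, which may differ from what a dedicated humanize package returns.

```javascript
// Hypothetical helper: turns camelCase, kebab-case, and snake_case input
// into lower-case words separated by single spaces.
function toSpacedWords(input) {
  return input
    // Split lower-to-upper boundaries: "imHaving" -> "im Having".
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2')
    // Split an upper-case letter from a following capitalized word: "ABad" -> "A Bad".
    .replace(/([A-Z])([A-Z][a-z])/g, '$1 $2')
    // Turn runs of hyphens and underscores into single spaces.
    .replace(/[-_]+/g, ' ')
    .trim()
    .toLowerCase()
}

console.log(toSpacedWords('imHavingABadDay')) // im having a bad day
console.log(toSpacedWords('im-having-a-bad-day')) // im having a bad day
console.log(toSpacedWords('im_having_a_bad_day')) // im having a bad day
```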

Find a humanize package on npm that suits you.

API

humanizeContractions(phrase[, options])

Returns the humanized string.

phrase

Type: string

Words separated by spaces.

options?

options.brutalMode

Type: boolean

When true, the isReplaceable hint is ignored and words marked isReplaceable: false are replaced too. Excluded words are still left alone.

options.exclude

Type: array

Replaceable words, in their simplified form (for example 'im'), that should be left untouched.

options.include

Type: array

Words marked as not replaceable, in their simplified form (for example 'shed'), that should be replaced anyway.

humanizeContractions.contractions

Type: array

List of the contractions.