Falk Amelung edited this page Sep 3, 2019 · 6 revisions

Note: This is a work in progress. Comments and suggestions can be discussed in this issue.

RSMAS Style Guide

Unless otherwise stated, we follow PEP 8 guidelines and Google Guidelines. We will focus mostly on RSMAS-specific coding guidelines in this doc and therefore will comment less on general Python style. The Google Guidelines are a good prerequisite to reading this doc (and great to learn from!).

General Style Patterns

  1. File names should always have extensions.
  2. Use the following string formatting pattern: "{} {}".format(var_0, var_1) when building strings.
  3. Functions should be iterative by default and avoid using recursion except where absolutely necessary.
  4. Import entire modules rather than specific functions from a given module: import _process_utilities as putils
  5. Do not explicitly return None unless prematurely exiting a function. Python does it for you.
  6. When parsing files for text, add an example to the function documentation of what text is being looked for.
  7. Only use try-except blocks when raising exceptions. Do not use them to handle KeyError exceptions.
  8. Do not use eval() and/or sys.exit(0).
  9. When returning multiple items from a function, return a tuple() object rather than a list.
  10. Place all import statements at the top of the file, even if they're only needed in one place.
  11. Constant parameters should be declared at the top of the file in all UPPERCASE.
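
Several of the patterns above can be sketched together in one small module. This is only an illustration: the names DEFAULT_UNITS, MAX_STEPS, count_halvings, describe_length, and get_option are hypothetical, and math stands in for any module that is imported whole.

```python
import math  # rules 4 and 10: import the whole module, at the top of the file

# Rule 11: constants declared at the top of the file in UPPERCASE.
# (Hypothetical names, used only for this illustration.)
DEFAULT_UNITS = "m"
MAX_STEPS = 1000


def count_halvings(value):
    """Count how many times `value` can be halved before it drops below 1.

    Written iteratively (rule 3) rather than recursively.
    """
    steps = 0
    while value >= 1 and steps < MAX_STEPS:
        value = value / 2
        steps += 1
    return steps


def describe_length(x, y):
    """Return the length of the vector (x, y) and a formatted label.

    Multiple values are returned as a tuple (rule 9); the label is built
    with str.format (rule 2).
    """
    length = math.hypot(x, y)
    label = "{} {}".format(length, DEFAULT_UNITS)
    return length, label


def get_option(options, key, default=None):
    """Look up a key with a membership test instead of catching KeyError (rule 7)."""
    if key in options:
        return options[key]
    return default


length, label = describe_length(3, 4)
print(label)
```

Unpacking the returned tuple at the call site, as in the last lines, keeps the two values named without the caller needing to index into the result.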

Naming Practices

  • module_name
  • package_name
  • ClassName
  • method_name
  • ExceptionName
  • function_name
  • GLOBAL_CONSTANT_NAME
  • global_var_name
  • instance_var_name
  • function_parameter_name
  • local_var_name

As you can see, use_underscores_to_separate_words_in_most_cases. Function names, variable names, and filenames should be descriptive and avoid abbreviating. In particular, do not use abbreviations that are ambiguous or unfamiliar to readers outside your project, and do not abbreviate by deleting letters within a word.

As a consequence, we should observe the following:

NO:   dest='startStep', 
YES:  dest='start_step'

NO:   '--remove_project_dir', dest='remove_project_dir',
YES:  '--removeProjectDir', dest='remove_project_dir'

NO:   self.customTemplateFile = inps.customTemplateFile
YES:  self.custom_template_file = inps.custom_template_file
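
The YES cases above can be put together in a short argparse sketch. The parser below is hypothetical: the --startStep flag spelling, the store_true action, the int type, and the defaults are assumptions made for illustration, not taken from an RSMAS script.

```python
import argparse


def create_parser():
    """Build a parser whose flags are camelCase but whose attributes are snake_case."""
    parser = argparse.ArgumentParser(description="Naming example (hypothetical script).")
    parser.add_argument("--removeProjectDir", dest="remove_project_dir",
                        action="store_true",
                        help="remove the project directory before running")
    # The flag name below is an assumption; only dest='start_step' comes from the guide.
    parser.add_argument("--startStep", dest="start_step", type=int, default=1,
                        help="processing step to start from")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
inps = create_parser().parse_args(["--removeProjectDir", "--startStep", "3"])
print(inps.remove_project_dir, inps.start_step)
```

Inside the code, only the snake_case attribute names (inps.remove_project_dir, inps.start_step) are ever used; the camelCase spelling appears only on the command line.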

Use of the subprocess and os Module

Parsing of Command Line Arguments

Logging to the Console and to Files

Whenever persistent logging is required for a script, use the custom rsmas_logging.RsmasLogger() object. This object takes a filename as a parameter and sets up both a StreamHandler, which writes logs to stdout, and a FileHandler, which writes logs to the provided file. The rsmas_logging.RsmasLogger() object is a wrapper class around the standard Python logging module and simply handles setup of the logging.Logger object and its associated handlers.

Intended use of the rsmas_logging.RsmasLogger() object is as follows:

    from rsmas_logging import RsmasLogger, loglevel
    
    log_file = "logfile.log"
    logger = RsmasLogger(log_file)
    logger.log(loglevel.INFO, "A log message that will be written to 'logfile.log' and `stdout`.")

Part of the logging module, and thus of the rsmas_logging module, is the ability to specify different "log levels", which makes searching for certain events within a .log file easier. The implemented log levels are:

  • loglevel.DEBUG - simple debugging statements such as the number of iterations of a given loop
  • loglevel.INFO - informational statements such as that a process has completed successfully
  • loglevel.WARNING - warning statements such as that a process has failed or something unexpected has occurred
  • loglevel.ERROR - error statements such as that an exception was thrown or erroneous inputs were provided
  • loglevel.CRITICAL - critical warning statements such as that a process has failed and exited the program entirely

Due to the implementation of the root logging.Logger object, all RsmasLogger() objects created during the execution of a given program share the same root logger object, which can lead to unexpected logging output, often seen as duplicated or even triplicated messages. The RsmasLogger() implementation attempts to prevent multiple handlers of the same type from being set up, but this has not been thoroughly tested.
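
The duplicate-handler problem can be illustrated with the standard logging module alone. The sketch below is not the RsmasLogger implementation; get_logger and the logger name "rsmas_example" are hypothetical. It attaches a handler only when no handler of that exact type is present, so asking for the same logger twice does not double the output.

```python
import logging
import os
import sys
import tempfile


def get_logger(name, file_name):
    """Return a logger with at most one StreamHandler and one FileHandler.

    Hypothetical helper. The exact type is compared (not isinstance), because
    FileHandler is itself a subclass of StreamHandler.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    existing_types = {type(handler) for handler in logger.handlers}

    if logging.StreamHandler not in existing_types:
        logger.addHandler(logging.StreamHandler(sys.stdout))
    if logging.FileHandler not in existing_types:
        logger.addHandler(logging.FileHandler(file_name))
    return logger


log_file = os.path.join(tempfile.mkdtemp(), "logfile.log")
first = get_logger("rsmas_example", log_file)
second = get_logger("rsmas_example", log_file)  # same underlying logger object

first.info("written once to the console and once to the file")
print(first is second, len(first.handlers))
```

Without the type check, the second call would add two more handlers to the same logger and every message would appear twice in both destinations.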

Accessing Template Options

Whenever access to options stored within a template file is necessary, use the custom dataset_template.Template() object. This object takes a template file as a parameter, which is usually passed along via a command line argument, and creates a dictionary of all the template options, which can be accessed using Python's standard dictionary access syntax: dict[key]

Intended use of the dataset_template.Template() object is as follows:

    from dataset_template import Template
    
    template_file = "template_file_name.template"
    dataset_template = Template(template_file)
    template_options = dataset_template.get_options()

    template_options['dataset']            # the dataset name of the template
    template_options['ssaraopt.platform']  # the ssaraopt.platform from the template  

dataset_template.Template() also contains a few helper methods that provide easier access to certain commonly used properties, such as the dataset name, as well as generation methods, to generate useful data structures from the options dictionary values. See: Template.get_dataset_name() and Template.generate_ssaraopt_string().
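
To make the dictionary-based access concrete, here is a minimal sketch of a class of this kind. It is not the dataset_template.Template implementation: SimpleTemplate is a hypothetical stand-in, the `name = value` line format is an assumption about template files, and the sample values are made up.

```python
import os
import tempfile


class SimpleTemplate:
    """Hypothetical stand-in for dataset_template.Template.

    Assumes each option occupies one line of the form:
        ssaraopt.platform = EXAMPLE_PLATFORM
    Lines without an `=` sign are ignored.
    """

    def __init__(self, template_file):
        self.options = {}
        with open(template_file) as template:
            for line in template:
                if "=" not in line:
                    continue
                key, value = line.split("=", 1)
                self.options[key.strip()] = value.strip()

    def get_options(self):
        """Return the dictionary of all options, keyed on option name."""
        return self.options

    def get_dataset_name(self):
        """Return the value of the 'dataset' option."""
        return self.options["dataset"]


# Write a small sample file (made-up values) and read it back.
template_file = os.path.join(tempfile.mkdtemp(), "example.template")
with open(template_file, "w") as sample:
    sample.write("dataset = ExampleDataset\n")
    sample.write("ssaraopt.platform = EXAMPLE_PLATFORM\n")

template_options = SimpleTemplate(template_file).get_options()
print(template_options["dataset"])
```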

File Structure Manipulations

Documentation

All functions, classes, objects, and methods should be thoroughly documented PRIOR to being merged into the master branch.

Function and method documentation should follow PEP 257 documentation format, and include the following information at a minimum:

  • What the function or method is doing
  • Why the function or method is needed and where it is/will be used
  • Parameters being passed, including their types
  • The return value of the function if any

Example:

def get_dataset_name(self):
    """ Provides quick access to the dataset property of the options dictionary.
        Should be used to quickly access the dataset name when directories require the dataset name
    """
    return self.options['dataset']
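
For a function that takes parameters and returns a value, the same checklist might be filled in as sketched below. The function count_matching_lines is hypothetical, and the `:param:` / `:return:` field layout is only one possible way to record types; PEP 257 itself does not prescribe it. The docstring also shows an example of the text being searched for, as asked for in the general style patterns.

```python
def count_matching_lines(lines, keyword):
    """Count the lines that contain a given keyword.

    Useful for summarizing a log file, for example to check how many error
    statements were written during a run.

    Example of the text being looked for (keyword="ERROR"):
        ERROR: erroneous inputs were provided

    :param lines: iterable of str, the lines of text to search
    :param keyword: str, the text to look for in each line
    :return: int, the number of lines containing `keyword`
    """
    return sum(1 for line in lines if keyword in line)


sample_lines = ["ERROR: erroneous inputs were provided", "INFO: process completed"]
print(count_matching_lines(sample_lines, "ERROR"))
```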

Class and object documentation should follow the same format as the function and method documentation but also include an intended use section with an example use case of the class/object.

Example:

class Template:
    """ Template object encapsulates a dictionary of template options.

        Given a dataset.template file, this object creates a dictionary of options keyed on the option name. This
        will allow for accessing of various template parameters in constant time (as opposed to O(n) time) and includes
        a simple, user friendly syntax.

        Use as follows:

            template = Template(file_name)                                  # create a template object
            options = template.get_options()                                # access the options dictionary
            dataset = options['dataset']                                    # access a specific option
            options = template.update_options(default_template_file)        # update the options using a default template file

    """
    
    def __init__(self, custom_template_file):
        """ Initializes Template object with a custom template file.
        ...

Within classes, objects, functions, and methods, logical blocks of code should include brief comments on their precise functionality, especially in cases where the implementation is non-trivial, or where file handling is taking place (this is especially important for those of us who don't touch the actual data file hierarchy very much and thus aren't as familiar with the file tree).