docs: Copy-edit Overview and extract Quickstart
jpmckinney committed Jul 20, 2024
1 parent 94d624f commit 9dfda8c
Showing 5 changed files with 62 additions and 89 deletions.
2 changes: 2 additions & 0 deletions docs/config.rst
@@ -203,6 +203,8 @@ Options

 .. attention:: It is not recommended to use a low interval like 0.1 when using the default :ref:`spiderqueue` value. Consider a custom queue based on `queuelib <https://github.com/scrapy/queuelib>`__.
 
+.. _config-launcher:
+
 Launcher options
 ----------------
 
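
A note on the attention directive in this hunk: `queuelib <https://github.com/scrapy/queuelib>`__ provides the disk-backed queue primitives that a custom queue could build on. The following is a minimal sketch of queuelib alone, assuming ``pip install queuelib``; Scrapyd's spider queue interface is not shown.

.. code-block:: python

   # Minimal sketch of queuelib, the library the attention note suggests
   # building a custom queue on. Assumes `pip install queuelib`; a real
   # Scrapyd spider queue must also implement Scrapyd's queue interface,
   # which is not shown here.
   from queuelib import FifoDiskQueue

   q = FifoDiskQueue("queue-dir")   # persists pending items on disk
   q.push(b"spider1")               # queuelib stores raw bytes
   q.push(b"spider2")
   assert q.pop() == b"spider1"     # FIFO: first pushed, first popped
   q.close()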
45 changes: 15 additions & 30 deletions docs/contributing.rst
@@ -1,52 +1,37 @@
 .. _contributing:
 
 Contributing
 ============
 
-.. important:: Read through the `Scrapy Contribution Docs <http://scrapy.readthedocs.org/en/latest/contributing.html>`__ for tips relating to writing patches, reporting bugs, and project coding style.
-
-These docs describe how to setup and contribute to Scrapyd.
+.. important:: Read through the `Scrapy Contribution Docs <http://scrapy.readthedocs.org/en/latest/contributing.html>`__ for tips relating to writing patches, reporting bugs, and coding style.
 
-Reporting issues & bugs
------------------------
+Issues and bugs
+---------------
 
-Issues should be reported to the Scrapyd project `issue tracker <https://github.com/scrapy/scrapyd/issues>`__ on GitHub.
+Report on `GitHub <https://github.com/scrapy/scrapyd/issues>`__.
 
 Tests
 -----
 
-Tests are implemented using the `Twisted unit-testing framework <https://docs.twisted.org/en/stable/development/test-standard.html>`__. Scrapyd uses ``trial`` as the test running application.
-
-Running tests
--------------
+Include tests in your pull requests.
 
-To run all tests go to the root directory of the Scrapyd source code and run:
+To run unit tests:
 
 .. code-block:: shell
 
-   trial tests
+   pytest tests
 
-To run a specific test (say ``tests/test_poller.py``) use:
+To run integration tests:
 
 .. code-block:: shell
 
-   trial tests.test_poller
-
-Writing tests
--------------
-
-All functionality (including new features and bug fixes) should include a test
-case to check that it works as expected, so please include tests for your
-patches if you want them to get accepted sooner.
-
-Scrapyd uses unit tests, which are located in the `tests <https://github.com/scrapy/scrapyd/tree/master/tests>`__ directory.
-Their module name typically resembles the full path of the module they're testing.
-For example, the scheduler code is in ``scrapyd.scheduler`` and its unit tests are in ``tests/test_scheduler.py``.
+   printf "[scrapyd]\nusername = hello12345\npassword = 67890world\n" > scrapyd.conf
+   mkdir logs
+   scrapyd &
+   pytest integration_tests
 
-Installing locally
-------------------
+Installation
+------------
 
-To install a locally edited version of Scrapyd onto the system to use and test, inside the project root run:
+To install an editable version for development, clone the repository, change to its directory, and run:
 
 .. code-block:: shell
5 changes: 0 additions & 5 deletions docs/deploy.rst
@@ -1,11 +1,6 @@
 Deployment
 ==========
 
-Deploying a Scrapy project
---------------------------
-
-This involves building a `Python egg <https://setuptools.pypa.io/en/latest/deprecated/python_eggs.html>`__ and uploading it to Scrapyd via the `addversion.json <https://scrapyd.readthedocs.org/en/latest/api.html#addversion-json>`_ webservice. Do this easily with the `scrapyd-deploy` command from the `scrapyd-client <https://github.com/scrapy/scrapyd-client>`__ package.
-
 .. _docker:
 
 Creating a Docker image
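
For reference, the paragraph removed above (moved to the Quickstart) refers to the ``addversion.json`` webservice. A hedged sketch of uploading a pre-built egg by hand, assuming the third-party ``requests`` package and an existing ``myproject.egg`` (``scrapyd-deploy`` automates all of this):

.. code-block:: python

   # Sketch: upload a pre-built egg via the addversion.json webservice.
   # Assumes `pip install requests` and that myproject.egg was already
   # built (e.g. by scrapyd-deploy, which automates this whole step).
   import requests

   with open("myproject.egg", "rb") as egg:
       response = requests.post(
           "http://localhost:6800/addversion.json",
           data={"project": "myproject", "version": "1.0"},
           files={"egg": egg},
       )
   print(response.json())  # e.g. {"status": "ok", ...}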
40 changes: 37 additions & 3 deletions docs/index.rst
@@ -1,22 +1,56 @@
 =================
 Scrapyd |release|
 =================
 
 .. include:: ../README.rst
 
-Installation
-------------
+Quickstart
+==========
+
+Install Scrapyd
+---------------
 
 .. code-block:: shell
 
    pip install scrapyd
 
+Start Scrapyd
+-------------
+
+.. code-block:: shell
+
+   scrapyd
+
+See :doc:`overview` and :doc:`config` for more details.
+
+Upload a project
+----------------
+
+This involves building a `Python egg <https://setuptools.pypa.io/en/latest/deprecated/python_eggs.html>`__ and uploading it to Scrapyd via the `addversion.json <https://scrapyd.readthedocs.org/en/latest/api.html#addversion-json>`_ webservice.
+
+Do this easily with the `scrapyd-deploy` command from the `scrapyd-client <https://github.com/scrapy/scrapyd-client>`__ package. Once configured:
+
+.. code-block:: shell
+
+   scrapyd-deploy
+
+Schedule a crawl
+----------------
+
+.. code-block:: shell-session
+
+   $ curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider2
+   {"status": "ok", "jobid": "26d1b1a6d6f111e0be5c001e648c57f8"}
+
+See :doc:`api` for more details.
+
 .. toctree::
    :maxdepth: 2
    :caption: Contents
 
    overview
    config
+   deploy
    api
-   deploy
    contributing
    news
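
As a companion to the ``curl`` call in the new Quickstart, the same crawl can be scheduled from Python's standard library; ``myproject`` and ``spider2`` are the placeholders from the example above.

.. code-block:: python

   # Standard-library equivalent of the Quickstart's curl example:
   # POST form data to schedule.json and read the JSON reply.
   import json
   from urllib.parse import urlencode
   from urllib.request import urlopen

   data = urlencode({"project": "myproject", "spider": "spider2"}).encode()
   with urlopen("http://localhost:6800/schedule.json", data=data) as response:
       print(json.load(response))  # {"status": "ok", "jobid": "..."}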
59 changes: 8 additions & 51 deletions docs/overview.rst
@@ -5,74 +5,31 @@ Overview
 Projects and versions
 =====================
 
-Scrapyd can manage multiple projects and each project can have multiple
-versions uploaded, but only the latest one will be used for launching new
-spiders.
+Scrapyd can manage multiple Scrapy projects. Each project can have multiple versions. The latest version is used by default for starting spiders.
 
-A common (and useful) convention to use for the version name is the revision
-number of the version control tool you're using to track your Scrapy project
-code. For example: ``r23``. The versions are not compared alphabetically but
-using a smarter algorithm (the same `packaging <https://pypi.org/project/packaging/>`__ uses) so ``r10`` compares
-greater to ``r9``, for example.
+The latest version is the alphabetically greatest, unless all version names are `version specifiers <https://packaging.python.org/en/latest/specifications/version-specifiers/>`__ like ``1.0`` or ``1.0rc1``, in which case they are sorted as such.
 
 How Scrapyd works
 =================
 
-Scrapyd is an application (typically run as a daemon) that listens to requests
-for spiders to run and spawns a process for each one, which basically
-executes:
+Scrapyd is a server (typically run as a daemon) that listens for :doc:`api` and :ref:`webui` requests.
+
+The API is especially used to upload projects and schedule crawls. To start a crawl, Scrapyd spawns a process that essentially runs:
 
 .. code-block:: shell
 
    scrapy crawl myspider
 
-Scrapyd also runs multiple processes in parallel, allocating them in a fixed
-number of slots given by the :ref:`max_proc` and :ref:`max_proc_per_cpu` options,
-starting as many processes as possible to handle the load.
+Scrapyd runs multiple processes in parallel, and manages the number of concurrent processes. See :ref:`config-launcher` for details.
 
-In addition to dispatching and managing processes, Scrapyd provides a
-:doc:`api` to upload new project versions
-(as eggs) and schedule spiders. This feature is optional and can be disabled if
-you want to implement your own custom Scrapyd. The components are pluggable and
-can be changed, if you're familiar with the `Twisted Application Framework <https://docs.twisted.org/en/stable/core/howto/application.html>`__
-which Scrapyd is implemented in.
-
-Starting from 0.11, Scrapyd also provides a minimal :ref:`web interface
-<webui>`.
-
-Starting Scrapyd
-================
-
-To start the service, use the ``scrapyd`` command provided in the Scrapy
-distribution:
-
-.. code-block:: shell
-
-   scrapyd
-
-That should get your Scrapyd started.
-
-Scheduling a spider run
-=======================
-
-To schedule a spider run:
-
-.. code-block:: shell-session
-
-   $ curl http://localhost:6800/schedule.json -d project=myproject -d spider=spider2
-   {"status": "ok", "jobid": "26d1b1a6d6f111e0be5c001e648c57f8"}
-
-For more resources see: :doc:`api` for more available resources.
+If you are familiar with the `Twisted Application Framework <https://docs.twisted.org/en/stable/core/howto/application.html>`__, you can essentially reconfigure every part of Scrapyd. See :doc:`config` for details.
 
 .. _webui:
 
 Web interface
 =============
 
-Scrapyd comes with a minimal web interface (for monitoring running processes
-and accessing logs) which can be accessed at http://localhost:6800/
-
-Other options to manage your Scrapyd cluster include:
+Scrapyd has a minimal web interface for monitoring running processes and accessing log files and item feeds. By default, it is available at http://localhost:6800/. Other options to manage Scrapyd include:
 
 - `ScrapydWeb <https://github.com/my8100/scrapydweb>`__
 - `spider-admin-pro <https://github.com/mouday/spider-admin-pro>`__
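
The version-ordering wording added under "Projects and versions" above can be illustrated with the `packaging <https://pypi.org/project/packaging/>`__ library; a sketch assuming ``pip install packaging`` (how Scrapyd applies the ordering internally is not shown).

.. code-block:: python

   # Sketch of the version ordering described above, assuming
   # `pip install packaging`. Alphabetically, "1.0rc1" sorts after "1.0";
   # as version specifiers, the release 1.0 is the latest. Names like
   # "r10" do not parse as version specifiers, so alphabetical order
   # would apply instead.
   from packaging.version import Version

   names = ["1.0rc1", "1.0", "0.9"]
   print(max(names))               # 1.0rc1 -- alphabetical "latest"
   print(max(names, key=Version))  # 1.0 -- version-specifier "latest"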
