Commit f814ebd

Deployed 9e1473c with MkDocs version: 1.6.1
1 parent 5f07660 commit f814ebd

41 files changed: +303 -303 lines changed

assets/Logo.png renamed to assets/logo.png

File renamed without changes.

core-concepts/classification/basic/index.html

+9 -9
Original file line number | Diff line number | Diff line change
@@ -1955,8 +1955,8 @@ <h1 id="basic-classification">Basic Classification<a class="headerlink" href="#b
19551955
<p>When classifying documents, the process involves extracting the content of the document and adding it to the prompt with several possible classifications. ExtractThinker simplifies this process using Pydantic models and instructor.</p>
19561956
<h2 id="simple-classification">Simple Classification<a class="headerlink" href="#simple-classification" title="Permanent link">&para;</a></h2>
19571957
<p>The most straightforward way to classify documents:</p>
1958-
<div class="highlight"><pre><span></span><code><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">from</span><span class="w"> </span><span class="nn">extract_thinker</span><span class="w"> </span><span class="kn">import</span> <span class="n">Classification</span><span class="p">,</span> <span class="n">Extractor</span>
1959-
<a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span><span class="w"> </span><span class="nn">extract_thinker.document_loader</span><span class="w"> </span><span class="kn">import</span> <span class="n">DocumentLoaderTesseract</span>
1958+
<div class="highlight"><pre><span></span><code><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">from</span> <span class="nn">extract_thinker</span> <span class="kn">import</span> <span class="n">Classification</span><span class="p">,</span> <span class="n">Extractor</span>
1959+
<a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span> <span class="nn">extract_thinker.document_loader</span> <span class="kn">import</span> <span class="n">DocumentLoaderTesseract</span>
19601960
<a id="__codelineno-0-3" name="__codelineno-0-3" href="#__codelineno-0-3"></a>
19611961
<a id="__codelineno-0-4" name="__codelineno-0-4" href="#__codelineno-0-4"></a><span class="c1"># Define classifications</span>
19621962
<a id="__codelineno-0-5" name="__codelineno-0-5" href="#__codelineno-0-5"></a><span class="n">classifications</span> <span class="o">=</span> <span class="p">[</span>
@@ -1984,27 +1984,27 @@ <h2 id="simple-classification">Simple Classification<a class="headerlink" href="
19841984
</code></pre></div>
19851985
<h2 id="type-mapping-with-contract">Type Mapping with Contract<a class="headerlink" href="#type-mapping-with-contract" title="Permanent link">&para;</a></h2>
19861986
<p>Adding contract structure to the classification improves accuracy:</p>
1987-
<div class="highlight"><pre><span></span><code><a id="__codelineno-1-1" name="__codelineno-1-1" href="#__codelineno-1-1"></a><span class="kn">from</span><span class="w"> </span><span class="nn">typing</span><span class="w"> </span><span class="kn">import</span> <span class="n">List</span>
1988-
<a id="__codelineno-1-2" name="__codelineno-1-2" href="#__codelineno-1-2"></a><span class="kn">from</span><span class="w"> </span><span class="nn">extract_thinker.models.contract</span><span class="w"> </span><span class="kn">import</span> <span class="n">Contract</span>
1987+
<div class="highlight"><pre><span></span><code><a id="__codelineno-1-1" name="__codelineno-1-1" href="#__codelineno-1-1"></a><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span>
1988+
<a id="__codelineno-1-2" name="__codelineno-1-2" href="#__codelineno-1-2"></a><span class="kn">from</span> <span class="nn">extract_thinker.models.contract</span> <span class="kn">import</span> <span class="n">Contract</span>
19891989
<a id="__codelineno-1-3" name="__codelineno-1-3" href="#__codelineno-1-3"></a>
1990-
<a id="__codelineno-1-4" name="__codelineno-1-4" href="#__codelineno-1-4"></a><span class="k">class</span><span class="w"> </span><span class="nc">InvoiceContract</span><span class="p">(</span><span class="n">Contract</span><span class="p">):</span>
1990+
<a id="__codelineno-1-4" name="__codelineno-1-4" href="#__codelineno-1-4"></a><span class="k">class</span> <span class="nc">InvoiceContract</span><span class="p">(</span><span class="n">Contract</span><span class="p">):</span>
19911991
<a id="__codelineno-1-5" name="__codelineno-1-5" href="#__codelineno-1-5"></a> <span class="n">invoice_number</span><span class="p">:</span> <span class="nb">str</span>
19921992
<a id="__codelineno-1-6" name="__codelineno-1-6" href="#__codelineno-1-6"></a> <span class="n">invoice_date</span><span class="p">:</span> <span class="nb">str</span>
19931993
<a id="__codelineno-1-7" name="__codelineno-1-7" href="#__codelineno-1-7"></a> <span class="n">lines</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">LineItem</span><span class="p">]</span>
19941994
<a id="__codelineno-1-8" name="__codelineno-1-8" href="#__codelineno-1-8"></a> <span class="n">total_amount</span><span class="p">:</span> <span class="nb">float</span>
19951995
<a id="__codelineno-1-9" name="__codelineno-1-9" href="#__codelineno-1-9"></a>
1996-
<a id="__codelineno-1-10" name="__codelineno-1-10" href="#__codelineno-1-10"></a><span class="k">class</span><span class="w"> </span><span class="nc">DriverLicense</span><span class="p">(</span><span class="n">Contract</span><span class="p">):</span>
1996+
<a id="__codelineno-1-10" name="__codelineno-1-10" href="#__codelineno-1-10"></a><span class="k">class</span> <span class="nc">DriverLicense</span><span class="p">(</span><span class="n">Contract</span><span class="p">):</span>
19971997
<a id="__codelineno-1-11" name="__codelineno-1-11" href="#__codelineno-1-11"></a> <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
19981998
<a id="__codelineno-1-12" name="__codelineno-1-12" href="#__codelineno-1-12"></a> <span class="n">age</span><span class="p">:</span> <span class="nb">int</span>
19991999
<a id="__codelineno-1-13" name="__codelineno-1-13" href="#__codelineno-1-13"></a> <span class="n">license_number</span><span class="p">:</span> <span class="nb">str</span>
20002000
</code></pre></div>
20012001
<p>The contract structure is automatically added to the prompt, helping the model understand the expected document structure.</p>
20022002
<h2 id="classification-response">Classification Response<a class="headerlink" href="#classification-response" title="Permanent link">&para;</a></h2>
20032003
<p>All classifications return a standardized response:</p>
2004-
<div class="highlight"><pre><span></span><code><a id="__codelineno-2-1" name="__codelineno-2-1" href="#__codelineno-2-1"></a><span class="kn">from</span><span class="w"> </span><span class="nn">typing</span><span class="w"> </span><span class="kn">import</span> <span class="n">Optional</span>
2005-
<a id="__codelineno-2-2" name="__codelineno-2-2" href="#__codelineno-2-2"></a><span class="kn">from</span><span class="w"> </span><span class="nn">pydantic</span><span class="w"> </span><span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Field</span>
2004+
<div class="highlight"><pre><span></span><code><a id="__codelineno-2-1" name="__codelineno-2-1" href="#__codelineno-2-1"></a><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span>
2005+
<a id="__codelineno-2-2" name="__codelineno-2-2" href="#__codelineno-2-2"></a><span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Field</span>
20062006
<a id="__codelineno-2-3" name="__codelineno-2-3" href="#__codelineno-2-3"></a>
2007-
<a id="__codelineno-2-4" name="__codelineno-2-4" href="#__codelineno-2-4"></a><span class="k">class</span><span class="w"> </span><span class="nc">ClassificationResponse</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
2007+
<a id="__codelineno-2-4" name="__codelineno-2-4" href="#__codelineno-2-4"></a><span class="k">class</span> <span class="nc">ClassificationResponse</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
20082008
<a id="__codelineno-2-5" name="__codelineno-2-5" href="#__codelineno-2-5"></a> <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
20092009
<a id="__codelineno-2-6" name="__codelineno-2-6" href="#__codelineno-2-6"></a> <span class="n">confidence</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span>
20102010
<a id="__codelineno-2-7" name="__codelineno-2-7" href="#__codelineno-2-7"></a> <span class="n">description</span><span class="o">=</span><span class="s2">&quot;From 1 to 10. 10 being the highest confidence&quot;</span><span class="p">,</span>
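
Note on the "Type Mapping with Contract" hunk above: the page says the contract structure is added to the prompt automatically. The following is a minimal, stdlib-only sketch of what that idea could look like. It is not ExtractThinker's implementation. `LineItem`'s fields, `describe_contract`, `build_prompt`, and the prompt wording are all hypothetical, and plain annotated classes stand in for the `Contract` subclasses shown in the diff.

```python
from typing import List, get_args, get_origin, get_type_hints


class LineItem:  # hypothetical fields; the diff uses LineItem without defining it
    description: str
    amount: float


class InvoiceContract:  # plain stand-in for the Contract subclass in the diff
    invoice_number: str
    invoice_date: str
    lines: List[LineItem]
    total_amount: float


class DriverLicense:  # plain stand-in for the Contract subclass in the diff
    name: str
    age: int
    license_number: str


def type_name(tp) -> str:
    """Return a readable name for a type hint, e.g. 'str' or 'list[LineItem]'."""
    args = get_args(tp)
    if args:
        inner = ", ".join(type_name(arg) for arg in args)
        return f"{get_origin(tp).__name__}[{inner}]"
    return tp.__name__


def describe_contract(contract: type) -> str:
    """List a contract's field names and types, one per line."""
    rows = [f"{contract.__name__}:"]
    for field_name, hint in get_type_hints(contract).items():
        rows.append(f"  - {field_name}: {type_name(hint)}")
    return "\n".join(rows)


def build_prompt(document_text: str, contracts: list) -> str:
    """Assemble a classification prompt that embeds each contract's structure."""
    options = "\n".join(describe_contract(contract) for contract in contracts)
    return (
        "Classify the document as one of the following types.\n"
        f"{options}\n\n"
        f"Document:\n{document_text}"
    )


print(build_prompt("Invoice #123", [InvoiceContract, DriverLicense]))
```

Listing field names and types gives the model concrete cues (an invoice has line items and a total, a license has a license number), which is one plausible reason the structure would help accuracy.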

core-concepts/classification/index.html

+3 -3
Original file line number | Diff line number | Diff line change
@@ -1901,10 +1901,10 @@ <h2 id="classification-techniques">Classification Techniques<a class="headerlink
19011901

19021902
<h2 id="classification-response">Classification Response<a class="headerlink" href="#classification-response" title="Permanent link">&para;</a></h2>
19031903
<p>All classification methods return a standardized response:</p>
1904-
<div class="highlight"><pre><span></span><code><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">from</span><span class="w"> </span><span class="nn">typing</span><span class="w"> </span><span class="kn">import</span> <span class="n">Optional</span>
1905-
<a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span><span class="w"> </span><span class="nn">pydantic</span><span class="w"> </span><span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Field</span>
1904+
<div class="highlight"><pre><span></span><code><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span>
1905+
<a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span> <span class="nn">pydantic</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Field</span>
19061906
<a id="__codelineno-0-3" name="__codelineno-0-3" href="#__codelineno-0-3"></a>
1907-
<a id="__codelineno-0-4" name="__codelineno-0-4" href="#__codelineno-0-4"></a><span class="k">class</span><span class="w"> </span><span class="nc">ClassificationResponse</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
1907+
<a id="__codelineno-0-4" name="__codelineno-0-4" href="#__codelineno-0-4"></a><span class="k">class</span> <span class="nc">ClassificationResponse</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
19081908
<a id="__codelineno-0-5" name="__codelineno-0-5" href="#__codelineno-0-5"></a> <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
19091909
<a id="__codelineno-0-6" name="__codelineno-0-6" href="#__codelineno-0-6"></a> <span class="n">confidence</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="n">Field</span><span class="p">(</span>
19101910
<a id="__codelineno-0-7" name="__codelineno-0-7" href="#__codelineno-0-7"></a> <span class="n">description</span><span class="o">=</span><span class="s2">&quot;From 1 to 10. 10 being the highest confidence&quot;</span><span class="p">,</span>
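
Note on the "Classification Response" hunk above: the field description gives confidence as an integer from 1 to 10. The sketch below shows one way a caller might enforce and use that range. It uses a stdlib dataclass rather than the Pydantic model in the diff. `ClassificationResult`, `is_confident`, the `None` default, and the threshold of 8 are all assumptions made for illustration, since the hunk is cut off before the rest of the `Field(...)` call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClassificationResult:  # hypothetical stand-in, not the library's class
    name: str
    confidence: Optional[int] = None  # 1 to 10, 10 being the highest (default assumed)

    def __post_init__(self) -> None:
        # Reject values outside the documented 1-10 range.
        if self.confidence is not None and not 1 <= self.confidence <= 10:
            raise ValueError("confidence must be between 1 and 10")


def is_confident(result: ClassificationResult, threshold: int = 8) -> bool:
    """Treat a missing confidence as not confident; otherwise compare to the threshold."""
    return result.confidence is not None and result.confidence >= threshold


print(is_confident(ClassificationResult("Invoice", 9)))
print(is_confident(ClassificationResult("Invoice", 5)))
```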

core-concepts/classification/mom/index.html

+2 -2
Original file line number | Diff line number | Diff line change
@@ -1936,8 +1936,8 @@
19361936
<h1 id="mixture-of-models-mom">Mixture of Models (MoM)<a class="headerlink" href="#mixture-of-models-mom" title="Permanent link">&para;</a></h1>
19371937
<p>The Mixture of Models (MoM) is a pattern that increases classification confidence by combining multiple models in parallel. This approach is particularly effective when using instructor for structured outputs.</p>
19381938
<h2 id="basic-usage">Basic Usage<a class="headerlink" href="#basic-usage" title="Permanent link">&para;</a></h2>
1939-
<div class="highlight"><pre><span></span><code><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">from</span><span class="w"> </span><span class="nn">extract_thinker</span><span class="w"> </span><span class="kn">import</span> <span class="n">Process</span><span class="p">,</span> <span class="n">Classification</span><span class="p">,</span> <span class="n">ClassificationStrategy</span>
1940-
<a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span><span class="w"> </span><span class="nn">extract_thinker.document_loader</span><span class="w"> </span><span class="kn">import</span> <span class="n">DocumentLoaderTesseract</span>
1939+
<div class="highlight"><pre><span></span><code><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="kn">from</span> <span class="nn">extract_thinker</span> <span class="kn">import</span> <span class="n">Process</span><span class="p">,</span> <span class="n">Classification</span><span class="p">,</span> <span class="n">ClassificationStrategy</span>
1940+
<a id="__codelineno-0-2" name="__codelineno-0-2" href="#__codelineno-0-2"></a><span class="kn">from</span> <span class="nn">extract_thinker.document_loader</span> <span class="kn">import</span> <span class="n">DocumentLoaderTesseract</span>
19411941
<a id="__codelineno-0-3" name="__codelineno-0-3" href="#__codelineno-0-3"></a>
19421942
<a id="__codelineno-0-4" name="__codelineno-0-4" href="#__codelineno-0-4"></a><span class="c1"># Define classifications</span>
19431943
<a id="__codelineno-0-5" name="__codelineno-0-5" href="#__codelineno-0-5"></a><span class="n">classifications</span> <span class="o">=</span> <span class="p">[</span>
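
Note on the Mixture of Models hunk above: the page describes MoM as combining multiple models in parallel to increase classification confidence. The sketch below illustrates that pattern with the standard library only. It does not use ExtractThinker's `Process` or `ClassificationStrategy`. The classifier callables, the consensus rule (all models agree and each reports at least the threshold), and the threshold value are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Tuple

Vote = Tuple[str, int]  # (label, confidence from 1 to 10)
Classifier = Callable[[str], Vote]


def classify_with_consensus(
    text: str, classifiers: List[Classifier], threshold: int = 8
) -> Optional[str]:
    """Run every classifier in parallel; accept a label only on confident agreement."""
    if not classifiers:
        raise ValueError("at least one classifier is required")
    with ThreadPoolExecutor(max_workers=len(classifiers)) as pool:
        votes = list(pool.map(lambda classifier: classifier(text), classifiers))
    labels = {label for label, _ in votes}
    if len(labels) == 1 and all(confidence >= threshold for _, confidence in votes):
        return labels.pop()
    return None  # disagreement or low confidence: caller decides what to do next


# Hypothetical stand-ins for real model calls.
def model_a(text: str) -> Vote:
    return ("Invoice", 9) if "invoice" in text.lower() else ("Driver License", 9)


def model_b(text: str) -> Vote:
    return ("Invoice", 8) if "total" in text.lower() else ("Driver License", 8)


print(classify_with_consensus("Invoice #123, total 40.00", [model_a, model_b]))
```

Returning `None` on disagreement leaves room for a fallback, such as escalating to a stronger model or flagging the document for manual review.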
