forked from pboling/each_in_batches
Introduction
================
From BolingForBatches:
I often need to execute really large computations on really large data sets.
I usually end up writing a rake task to do it, which calls methods in my models.
But something about the process bugged me. Each time I had to re-implement my
'batching code' that allowed me to not chew up GB after GB of memory due to
klass.find(:all, :include => [:everything_under_the_sun]). Re-implementation of
the same logic over and over across many projects is not very DRY, so I got out
my blow torch and lit it up. The difficulty was that the part that was different
each time I batched was at the center of the code, right in the middle of the
batch loop. But I didn't let that stop me!
EachInBatches:
I needed to iterate over the results and perform more actions than a single
method would provide. I didn't want to write a method in my app that performed
the needed functionality as I felt the plugin should support this directly.
I modified the original plugin so that it takes a block instead of a method.
It will pass the object instance to the block. It works pretty much the same
as Class.find(:all).each {|x| do something}, except that records are loaded in
batches of n, where n is set with :batch_size.
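The idea can be sketched without Rails. This is a minimal illustration only, not the plugin's implementation: a plain Array stands in for the table, and the hypothetical helper each_in_batches_sketch loads one slice of batch_size records at a time and yields each record to the block.

```ruby
# Minimal sketch of block-based batching. This is NOT the plugin's code:
# `records` is a plain Array standing in for a database table, and
# `each_in_batches_sketch` is a hypothetical helper name.
def each_in_batches_sketch(records, batch_size = 50)
  offset = 0
  loop do
    # Stand-in for a query with LIMIT batch_size OFFSET offset.
    batch = records[offset, batch_size] || []
    break if batch.empty?
    # Hand each record to the caller's block, one at a time.
    batch.each { |record| yield record }
    offset += batch_size
  end
end

# Seven records processed in batches of three.
seen = []
each_in_batches_sketch((1..7).to_a, 3) { |x| seen << x * 10 }
```

Only one slice is held at a time, which is what keeps memory use bounded compared with loading every record up front.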
Installation
============
./script/plugin install git://github.com/wickkidd/each_in_batches.git
Example
=======
To create a new Batch, call Batch.new and pass it the class and any additional arguments (all as a hash).
batch = EachInBatches::Batch.new(:klass => Payment, :select => "DISTINCT transaction_id", :batch_size => 50, :order => 'transaction_id')
To process the batched data, pass a block to Batch#run the same way you would to an object returned by Class.find(:all).each.
Batch#run will pass the records to your block one at a time, loading them in batches whose size is set by the :batch_size argument.
batch.run {|x| puts x.id; puts x.transaction_id}
Print the results!
batch.print_results
Or...
Consolidate your code if you prefer:
EachInBatches::Batch.new(:klass => Payment, :select => "DISTINCT transaction_id", :batch_size => 50, :order => 'transaction_id', :show_results => true).run {|x| puts x.id; puts x.transaction_id}
Configuration
=============
Arguments for the initializer (Batch.new) method are:
Required:
:klass - Usage: :klass => MyClass
Required, as this is the class that will be batched
Optional:
:include - Usage: :include => [:assoc]
Optional
:select - Usage: :select => "DISTINCT field_name"
or
:select => "field1, field2, field3"
:order - Usage: :order => "field DESC"
:conditions - Usage: :conditions => ["field1 is not null and field2 = ?", x]
:verbose - Usage: :verbose => true or false
Sets verbosity of output
Default: false (if not provided)
:batch_size - Usage: :batch_size => x
Where x is some number.
How many AR Objects should be processed at once?
Default: 50 (if not provided)
:last_batch - Usage: :last_batch => x
Where x is some number.
Only process up to and including batch #x.
Batch numbers start at 0 for the first batch.
Default: won't be used (no limit if not provided)
:first_batch - Usage: :first_batch => x
Where x is some number.
Begin processing batches beginning at batch #x.
Batch numbers start at 0 for the first batch.
Default: won't be used (no offset if not provided)
:show_results - Usage: :show_results => true or false
Prints statistics about the results of Batch#run.
Default: true if verbose is set to true and :show_results is not provided, otherwise false
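The documented defaults can be summarized in a short sketch. This is an assumption about how the options might be resolved, not the plugin's source: resolve_options is a hypothetical helper, and only the option names and default values come from the list above.

```ruby
# Hypothetical sketch of resolving the documented defaults.
# `resolve_options` is an illustrative name, not part of the plugin's API.
def resolve_options(opts)
  raise ArgumentError, ":klass is required" unless opts.key?(:klass)
  verbose = opts.fetch(:verbose, false)
  {
    :klass        => opts[:klass],
    :verbose      => verbose,
    :batch_size   => opts.fetch(:batch_size, 50),
    :first_batch  => opts[:first_batch],  # nil means no offset
    :last_batch   => opts[:last_batch],   # nil means no limit
    # :show_results follows :verbose unless it is given explicitly.
    :show_results => opts.fetch(:show_results, verbose)
  }
end

opts = resolve_options(:klass => String, :verbose => true)
```

Here String is only a placeholder for a model class; with :verbose set and :show_results omitted, the resolved :show_results is true and :batch_size falls back to 50.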
Output
======
Interpreting the output:
'[O]' means the batch was skipped due to an offset.
'[L]' means the batch was skipped due to a limit.
'[P]' means the batch is processing.
'[C]' means the batch is complete.
And yes, it was a coincidence. This class is not affiliated with 'One Laptop per Child'.
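As a rough illustration of how the markers relate to :first_batch and :last_batch, the sketch below maps a batch number to its marker. The helper batch_marker and its finished flag are hypothetical, and the real plugin may decide and print these differently; only the marker meanings, the 0-based batch numbers, and the inclusive :last_batch come from this README.

```ruby
# Hypothetical mapping from a batch number to the documented markers.
# `batch_marker` is an illustrative name, not part of the plugin.
def batch_marker(index, first_batch, last_batch, finished = false)
  return '[O]' if first_batch && index < first_batch  # skipped: before the offset
  return '[L]' if last_batch && index > last_batch    # skipped: past the limit
  finished ? '[C]' : '[P]'                            # complete or still processing
end

# Five batches (0..4) with :first_batch => 1 and :last_batch => 3,
# viewed after every processed batch has finished.
markers = (0..4).map { |i| batch_marker(i, 1, 3, true) }
```

With that window, batch 0 is skipped by the offset, batches 1 through 3 are processed, and batch 4 is skipped by the limit.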
License
=======
Copyright (c) 2008 Peter H. Boling, released under the MIT license
Or in other words have fun, and don't blame me!