
Commit f35fa1d
push 774M model
1 parent: cb41537

File tree: 7 files changed (+20 lines, -15 lines)


DEVELOPERS.md

Lines changed: 3 additions & 2 deletions

````diff
@@ -27,8 +27,9 @@ pip3 install -r requirements.txt
 
 Download the model data
 ```
-python3 download_model.py 117M
-python3 download_model.py 345M
+python3 download_model.py 124M
+python3 download_model.py 355M
+python3 download_model.py 774M
 ```
 
 ## Docker Installation
````
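With this change, all three released model sizes can be fetched in one go. A minimal shell sketch, assuming it is run from the repository root where `download_model.py` lives:

```sh
# Download every currently released GPT-2 size named in the diff above.
for size in 124M 355M 774M; do
    python3 download_model.py "$size"
done
```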

Dockerfile.cpu

Lines changed: 3 additions & 2 deletions

```diff
@@ -5,5 +5,6 @@ RUN mkdir /gpt-2
 WORKDIR /gpt-2
 ADD . /gpt-2
 RUN pip3 install -r requirements.txt
-RUN python3 download_model.py 117M
-RUN python3 download_model.py 345M
+RUN python3 download_model.py 124M
+RUN python3 download_model.py 355M
+RUN python3 download_model.py 774M
```

Dockerfile.gpu

Lines changed: 3 additions & 2 deletions

```diff
@@ -14,5 +14,6 @@ RUN mkdir /gpt-2
 WORKDIR /gpt-2
 ADD . /gpt-2
 RUN pip3 install -r requirements.txt
-RUN python3 download_model.py 117M
-RUN python3 download_model.py 345M
+RUN python3 download_model.py 124M
+RUN python3 download_model.py 355M
+RUN python3 download_model.py 774M
```
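Both images now download all three models at build time, so the weights are baked into the image. A hypothetical build invocation (the `gpt-2:cpu` and `gpt-2:gpu` tags are illustrative, not from this commit):

```sh
# Build the CPU image; the three download_model.py RUN steps execute here.
docker build -f Dockerfile.cpu -t gpt-2:cpu .

# Build the GPU variant the same way.
docker build -f Dockerfile.gpu -t gpt-2:gpu .
```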

README.md

Lines changed: 4 additions & 2 deletions

```diff
@@ -4,9 +4,11 @@
 
 Code from the paper ["Language Models are Unsupervised Multitask Learners"](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf).
 
-We have currently released small (117M parameter) and medium (345M parameter) versions of GPT-2. While we have not released the larger models, we have [released a dataset](https://github.com/openai/gpt-2-output-dataset) for researchers to study their behaviors.
+We have currently released small (124M parameter), medium (355M parameter), and large (774M parameter) versions of GPT-2<sup>*</sup>, with only the full model as of yet unreleased. We have also [released a dataset](https://github.com/openai/gpt-2-output-dataset) for researchers to study their behaviors.
 
-See more details in our [blog post](https://blog.openai.com/better-language-models/).
+You can read about GPT-2 and release decisions in our [original blog post](https://blog.openai.com/better-language-models/) and [6 month follow-up post](https://openai.com/blog/gpt-2-6-month-follow-up/).
+
+<sup>*</sup> *Note that our original parameter counts were wrong due to an error (in our previous blog posts and paper). Thus you may have seen small referred to as 117M and medium referred to as 345M.*
 
 ## Usage
 
```

download_model.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -4,7 +4,7 @@
 from tqdm import tqdm
 
 if len(sys.argv) != 2:
-    print('You must enter the model name as a parameter, e.g.: download_model.py 117M')
+    print('You must enter the model name as a parameter, e.g.: download_model.py 124M')
     sys.exit(1)
 
 model = sys.argv[1]
```
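The script still requires exactly one positional argument; only the example name in the usage message changes. Both paths, sketched under the assumption that the updated script is run from the repository root:

```sh
# No argument: prints the updated usage hint and exits with status 1.
python3 download_model.py

# One argument: downloads the renamed small model.
python3 download_model.py 124M
```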

src/generate_unconditional_samples.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -9,7 +9,7 @@
 import model, sample, encoder
 
 def sample_model(
-    model_name='117M',
+    model_name='124M',
     seed=None,
     nsamples=0,
     batch_size=1,
@@ -20,7 +20,7 @@ def sample_model(
 ):
     """
     Run the sample_model
-    :model_name=117M : String, which model to use
+    :model_name=124M : String, which model to use
     :seed=None : Integer seed for random number generators, fix seed to
      reproduce results
     :nsamples=0 : Number of samples to return, if 0, continues to
```
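The only behavioral change is the default: running the script without an explicit model now loads 124M instead of 117M. A sketch of both invocations, assuming (as in this repository) that `sample_model`'s keyword arguments are exposed as command-line flags:

```sh
# Uses the new default model_name='124M'.
python3 src/generate_unconditional_samples.py

# Overrides the default; flag names mirror the keyword arguments above.
python3 src/generate_unconditional_samples.py --model_name=774M --nsamples=2
```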

src/interactive_conditional_samples.py

Lines changed: 4 additions & 4 deletions

```diff
@@ -9,18 +9,18 @@
 import model, sample, encoder
 
 def interact_model(
-    model_name='117M',
+    model_name='124M',
     seed=None,
     nsamples=1,
     batch_size=1,
     length=None,
     temperature=1,
     top_k=0,
-    models_dir='models',
+    models_dir='models',
 ):
     """
     Interactively run the model
-    :model_name=117M : String, which model to use
+    :model_name=124M : String, which model to use
     :seed=None : Integer seed for random number generators, fix seed to reproduce
      results
     :nsamples=1 : Number of samples to return total
@@ -36,7 +36,7 @@ def interact_model(
     while 40 means 40 words are considered at each step. 0 (default) is a
     special setting meaning no restrictions. 40 generally is a good value.
     :models_dir : path to parent folder containing model subfolders
-     (i.e. contains the <model_name> folder)
+     (i.e. contains the <model_name> folder)
     """
     models_dir = os.path.expanduser(os.path.expandvars(models_dir))
     if batch_size is None:
```

(The `models_dir='models',` line and the `(i.e. contains the <model_name> folder)` line appear to change only in whitespace.)
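As with the unconditional script, only the default model name changes for callers. A sketch, again assuming keyword arguments map to flags; `--top_k=40` follows the docstring's own recommendation:

```sh
# Interactive prompting with the new default model (124M).
python3 src/interactive_conditional_samples.py

# Explicit model choice plus the docstring-recommended top_k value.
python3 src/interactive_conditional_samples.py --model_name=355M --top_k=40
```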
