Merge pull request #9 from prajjwal1/meta
code clean up; logo design
prajjwal1 authored Jul 11, 2020
2 parents 90e2809 + 3ffcc81 commit 12f1de4
Showing 3 changed files with 11 additions and 43 deletions.
19 changes: 10 additions & 9 deletions README.md
@@ -1,11 +1,16 @@
# Fluence
> Fluence is a PyTorch-based deep learning library focused on providing computationally efficient, low-resource methods and algorithms. Although the main focus is on supporting transformers for NLP tasks, it can be extended to other domains and architectures as well. Currently in the pre-alpha stage.
![Fluence Logo](https://github.com/prajjwal1/fluence/blob/master/docs/logo.png)
-------------------------------------------------------------------------------

Fluence is a PyTorch-based deep learning library focused on providing computationally efficient, low-resource methods and algorithms for NLP. Although the main focus is on supporting transformers for NLP tasks, it can be extended to other domains and architectures as well. It is currently in the pre-alpha stage.


![badge](https://github.com/prajjwal1/fluence/workflows/build/badge.svg)
[![PyPI version](https://badge.fury.io/py/fluence.svg)](https://badge.fury.io/py/fluence)

# Installing
- [Installation](#installing)
- [Overview](#overview)

## Installing
For stable (recommended) version:
```bash
pip3 install --user fluence
@@ -18,24 +23,20 @@ cd fluence
python3 setup.py install --user
```
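As a quick sanity check (a generic pip command, not an instruction from this README), you can confirm which version was installed:

```bash
pip3 show fluence
```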

## Overview
The library contains implementations of the following approaches (many more to come):
- [Adaptive Methods](https://github.com/prajjwal1/fluence/wiki/Importance-sampling)
    - [Adaptive Attention Span in Transformers](https://arxiv.org/abs/1905.07799)
    - [Adaptively Sparse Transformers](https://arxiv.org/abs/1909.00015)
    - [Reducing Transformer Depth on Demand with Structured Dropout](https://arxiv.org/abs/1909.11556)

- [Meta Learning](https://github.com/prajjwal1/fluence/wiki/Meta-Learning)

- [Optimizers](https://github.com/prajjwal1/fluence/wiki/Optimizers) (a brief usage sketch follows this list):
    - [Lamb](https://arxiv.org/abs/1904.00962)
    - [Lookahead](https://arxiv.org/abs/1907.08610)

- [Importance Sampling](https://github.com/prajjwal1/fluence/wiki/Importance-sampling):
    - Clustering

- [Siamese Transformers](https://github.com/prajjwal1/fluence/wiki/Siamese-Transformers)
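
A minimal sketch of how the optimizer utilities might be wired into a training step. The import path (`fluence.optim`) and the `Lamb`/`Lookahead` signatures below are assumptions based on the reference implementations from the linked papers; check the wiki pages above for the exact API.

```python
import torch
import torch.nn as nn
from fluence.optim import Lamb, Lookahead  # assumed import path

model = nn.Linear(768, 2)

# Lamb: layer-wise adaptive optimizer for large-batch training (assumed signature).
base_optimizer = Lamb(model.parameters(), lr=1e-3, weight_decay=0.01)
# Lookahead: wraps any base optimizer with slow/fast weight updates (assumed signature).
optimizer = Lookahead(base_optimizer, k=5, alpha=0.5)

x, y = torch.randn(8, 768), torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```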

### Documentation
## Documentation
Please head to this [link](https://github.com/prajjwal1/fluence/wiki) to learn how you can integrate fluence into your workflow. Since this is an early release, there may be bugs here and there. Please file an issue if you encounter one.

### Contribution
Binary file added docs/logo.png
35 changes: 1 addition & 34 deletions fluence/models/siamese_model.py
@@ -7,19 +7,6 @@
logger = logging.getLogger(__name__)


class PredictionHeadTransform(nn.Module):
def __init__(self, config):
super().__init__()
self.pool = nn.AdaptiveAvgPool2d((8, 128))
self.dense = nn.Linear(4096, len(config.id2label))

def forward(self, features):
features = self.pool(features)
features = features.view(features.shape[0] // 4, -1)
features = self.dense(features)
return features


class SiameseTransformer(nn.Module):
def __init__(self, args, config):
super(SiameseTransformer, self).__init__()
@@ -31,31 +18,11 @@ def __init__(self, args, config):
self.args.model_name, config=config, cache_dir=self.args.cache_dir
)

self.loss_fct = nn.CrossEntropyLoss()
# self.cls = PredictionHeadTransform(config)
# self.cls = nn.Linear(len(config.id2label), len(config.id2label))
# if self.args.freeze_a:
# logger.info("**** Freezing Model A ****")
# for param in self.model_a.encoder.parameters():
# param.requires_grad = False

# if self.args.freeze_b:
# logger.info("**** Freezing Model B ****")
# for param in self.model_b.encoder.parameters():
# param.requires_grad = False

def forward(self, a, b):
# labels = input_a['labels']
# input_a.pop('labels')
# input_b.pop('labels')
output_a = self.model_a(**a) # [bs, seq_len, 768]
output_a = self.model_a(**a)
output_b = self.model_b(**b)
outputs = []
for i in range(len(output_a)):
outputs.append(output_a[i] + output_b[i])

# concat_output = torch.cat([output_a[1], output_b[1]])
# logits = self.cls(concat_output)
# outputs.append(logits)
# loss = self.loss_fct(logits, labels)
return outputs

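For context, here is a minimal sketch of how the `SiameseTransformer` above might be driven after this cleanup. The `args` fields (`model_name`, `cache_dir`) are inferred from the constructor shown in the diff, and the import path, tokenizer setup, and padding choice are assumptions, not part of this commit; the constructor may also expect additional `args` fields hidden in the collapsed portion of the diff.

```python
import argparse
import torch
from transformers import AutoConfig, AutoTokenizer
from fluence.models.siamese_model import SiameseTransformer  # assumed import path

# Hypothetical args namespace; only model_name and cache_dir appear in the diff.
args = argparse.Namespace(model_name="bert-base-uncased", cache_dir=None)
config = AutoConfig.from_pretrained(args.model_name)
tokenizer = AutoTokenizer.from_pretrained(args.model_name)
model = SiameseTransformer(args, config)

# Pad both sides to the same length so the element-wise sum in forward() lines up.
a = tokenizer("A man is playing a guitar.", return_tensors="pt",
              padding="max_length", max_length=32, truncation=True)
b = tokenizer("Someone plays an instrument.", return_tensors="pt",
              padding="max_length", max_length=32, truncation=True)

with torch.no_grad():
    outputs = model(a, b)  # element-wise sums of the two encoders' outputs
```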